(54) Title: GENES AMPLIFIED IN CANCER CELLS
(57) Abstract 

New methods are disclosed for detecting cancer-associated
genes, and obtaining corresponding cDNA sequences. The 
methods involve supplying RNA preparations from control cells, 
and from a plurality of different cancer cells that share a 
duplicated or deleted gene in the same region of a chromosome. 
Amplified cDNA copies are displayed, and then selected based 
on differences in abundance of RNA between preparations. 
Optional additional screening steps involve surveying panels of 
cancer cells using the cDNA for RNA overabundance with or 
without gene duplication. The identified genes can be used 
in turn to develop materials and techniques for diagnosing and 
treating the underlying cancer. Four novel genes associated with 
cancer have been identified. In at least about 60% of the breast
cancer cell lines tested, RNA hybridizing with the cDNAs was
substantially more abundant than in normal cells. Most of the
cell lines also showed a duplication of the corresponding gene, 
which probably contributed to the increased level of RNA in the 
cell. However, for each of the four genes, there were some cell 
lines which had RNA overabundance without gene duplication. 
This suggests that the gene product is sufficiently important 
to the cancer process that cells will use several alternative 
mechanisms to achieve increased expression. 
Genes Amplified in Cancer Cells

Priority Claim



This application claims the priority benefit of the following U.S. Patent applications: 
60/015,167, filed April 9, 1996; 60/019,202, filed June 6, 1996; 08/678,280, filed July 10, 1996. For
purposes of prosecution in the U.S., the aforementioned applications are hereby incorporated herein
by reference in their entirety. 

Technical Field

The present invention relates generally to the field of human genetics. More specifically, it 
relates to the identification of novel genes associated with overabundance of RNA in human cancer 
such as breast cancer. It pertains especially to those genes and the products thereof which may be 
important in diagnosis and treatment. 

Background of the Invention

Cancer is a heterogeneous disease. It manifests itself in a wide variety of tissue sites, with 
different degrees of de-differentiation, invasiveness, and aggressiveness. Some forms of cancer 
are responsive to traditional modes of therapy, but many are not. For most common cancers, there 
is a pressing need to improve the arsenal of therapies available to provide more precise and more 
effective treatment in a less invasive way. 

As an example, breast cancer has an unsatisfactory morbidity and mortality, despite 
presently available forms of medical intervention. Traditional clinical initiatives are focused on early 
diagnosis, followed by surgery and chemotherapy. Such interventions are of limited success, 
particularly in patients where the tumor has undergone metastasis. 

The heterogeneous nature of cancer arises because different cancer cells achieve their 
growth and pathological properties by different phenotypic alterations. Alteration of gene 
expression is intimately related to the uncontrolled growth and de-differentiation that are hallmarks 
of cancer. Certain similar phenotypic alterations in turn may have a different genetic base in 
different tumors. Yet, the number of genes central to the malignant process must be a finite one.
Accordingly, new pharmaceuticals that are tailored to specific genetic alterations in an individual 
tumor may be more effective. 

There are two types of altered gene expression that take place, together or independently, 
in different cancer cells (reviewed by Bishop). The first type is the decreased expression of 
recessive genes, known as tumor suppressor genes, that apparently act to prevent malignant
growth. The second type is the increased expression of dominant genes, such as oncogenes, that 
act to promote malignant growth, or to provide some other phenotype critical for malignancy. Thus,
alteration in the expression of either type of gene is a potential diagnostic indicator. Furthermore, a 
treatment strategy might seek to reinstate the expression of suppressor genes, or reduce the
expression of dominant genes. The present invention is directed to identifying genes of either type, 
particularly those of the second type. 

The most frequently studied mechanism for gene overexpression in cancer cells is 
sometimes referred to as amplification. This is a process whereby the gene is duplicated within the 
chromosomes of the ancestral cell into multiple copies. The process involves unscheduled 
replications of the region of the chromosome comprising the gene, followed by recombination of the 
replicated segments back into the chromosome (Alitalo et al.). As a result, 50 or more copies of 
the gene may be produced. The duplicated region is sometimes referred to as an "amplicon". The
level of expression of the gene (that is, the amount of messenger RNA produced) escalates in the
transformed cell in the same proportion as the number of copies of the gene that are made (Alitalo
et al.).

Several human oncogenes have been described, some of which are duplicated, for 
example, in a significant proportion of breast tumors. A prototype is the erbB2 gene (also known
as HER-2/neu), which encodes a 185 kDa membrane growth factor receptor homologous to the 
epidermal growth factor receptor. erbB2 is duplicated in 61 of 283 tumors (22%) tested in a recent 
survey (Adnane et al.). Other oncogenes duplicated in breast cancer are the bek gene, duplicated
in 34 out of 286 (12%); the flg gene, duplicated in 37 out of 297 (12%); and the myc gene, duplicated in
43 out of 275 (16%) (Adnane et al.).
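
The percentages quoted above follow directly from the tumor counts attributed to Adnane et al. As a simple arithmetic check, the following sketch recomputes each frequency. The dictionary and function names are illustrative assumptions and are not part of the cited survey.

```python
# Tumor counts as quoted in the text (duplicated, tested), per Adnane et al.
# The names DUPLICATION_COUNTS and percent_duplicated are illustrative only.
DUPLICATION_COUNTS = {
    "erbB2": (61, 283),
    "bek": (34, 286),
    "flg": (37, 297),
    "myc": (43, 275),
}


def percent_duplicated(duplicated, tested):
    """Return the percentage of tested tumors that show gene duplication."""
    return 100.0 * duplicated / tested


for gene, (duplicated, tested) in DUPLICATION_COUNTS.items():
    pct = percent_duplicated(duplicated, tested)
    print(f"{gene}: {duplicated}/{tested} = {pct:.1f}%")
```

Rounded to the nearest whole number, each computed value agrees with the percentage given in the text.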

Work with other oncogenes, particularly those described for neuroblastoma, suggested that 
gene duplication of the proto-oncogene was an event involved in the more malignant forms of 
cancer, and could act as a predictor of clinical outcome (reviewed by Schwab et al. and Alitalo et
al.). In breast cancer, duplication of the erbB2 gene has been reported as correlating both with
recurrence of the disease and decreased survival times (Slamon et al.). There is some evidence
that erbB2 helps identify tumors that are responsive to adjuvant chemotherapy with
cyclophosphamide, doxorubicin, and fluorouracil (Muss et al.). 

It is clear that only a proportion of the genes that can undergo gene duplication in cancer 
have been identified. First, chromosome abnormalities, such as double minute (DM) chromosomes
and homogeneously stained regions (HSRs), are abundant in cancer cells. HSRs are
chromosomal regions that appear in karyotype analysis with intermediate density Giemsa staining
throughout their length, rather than with the normal pattern of alternating dark and light bands.
They correspond to multiple gene repeats. HSRs are particularly abundant in breast cancers,
showing up in 60-65% of tumors surveyed (Dutrillaux et al., Zafrani et al.). When such regions are
checked by in situ hybridization with probes for any of 16 known human oncogenes, including
erbB2 and myc, only a proportion of tumors show any hybridization to HSR regions. Furthermore,
only a proportion of the HSRs within each karyotype are implicated. 
Second, comparative genomic hybridization (CGH) has revealed the presence of copy
number increases in tumors, even in chromosomal regions outside of HSRs. CGH is a new
method in which whole chromosome spreads are stained simultaneously with DNA fragments from
normal cells and from cancer cells, using two different fluorochromes. The images are
computer-processed for the fluorescence ratio, revealing chromosomal regions that have
undergone amplification or deletion in the cancer cells (Kallioniemi et al. 1992). This method was
recently applied to 15 breast cancer cell lines (Kallioniemi et al. 1994). DNA sequence copy
number increases were detected in all 23 chromosome pairs.
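
The fluorescence-ratio comparison that underlies CGH can be illustrated with a minimal sketch. It assumes that a tumor signal and a normal signal have already been measured for each chromosomal region. The cutoff values, region names, and intensities below are hypothetical and are chosen only for illustration; the cited references, not this sketch, define the actual image-processing procedure.

```python
# Hypothetical cutoffs for calling a gain or a loss from the tumor/normal
# fluorescence ratio. The source text does not specify numeric thresholds.
GAIN_THRESHOLD = 1.25
LOSS_THRESHOLD = 0.75


def classify_region(tumor_signal, normal_signal):
    """Return (ratio, call) for one region from its two fluorescence signals."""
    if normal_signal <= 0:
        raise ValueError("normal_signal must be positive")
    ratio = tumor_signal / normal_signal
    if ratio >= GAIN_THRESHOLD:
        return ratio, "gain"
    if ratio <= LOSS_THRESHOLD:
        return ratio, "loss"
    return ratio, "balanced"


# Made-up intensities for three illustrative regions (not experimental data).
regions = {
    "region_A": (180.0, 100.0),
    "region_B": (98.0, 100.0),
    "region_C": (55.0, 100.0),
}

for name, (tumor, normal) in regions.items():
    ratio, call = classify_region(tumor, normal)
    print(f"{name}: ratio={ratio:.2f} -> {call}")
```

A ratio of this kind indicates only the approximate region affected and does not identify which gene within the region is duplicated or deleted.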

Cloning the genes that undergo duplication in cancer is a formidable challenge. In one 
approach, human oncogenes have been identified by hybridizing with probes for other known
growth-promoting genes, particularly known oncogenes in other species. For example, the erbB2
gene was identified using a probe from a chemically induced rat neuroglioblastoma (Slamon et al.).
Genes with novel sequences and functions will evade this type of search. In another approach,
genes may be cloned from an area identified as containing a duplicated region by the CGH method.
Since CGH is able to indicate only the approximate chromosomal region of duplicated genes, an
extensive amount of experimentation is required to walk through the entire region and identify the 
particular gene involved. 

Genes may also be overexpressed in cancer without being duplicated. Methods that rely 
on identification from genetic abnormalities necessarily bypass such genes. Increased expression 
can come about through a higher level of transcription of the gene; for example, by up-regulation of
the promoter or substitution with an alternative promoter. It can also occur if the transcription
product is able to persist longer in the cell; for example, by increasing the resistance to cytoplasmic
RNase or by reducing the level of such cytoplasmic enzymes. Two examples are the epidermal
growth factor receptor, overexpressed in 45% of breast cancer tumors (Klijn et al.), and the IGF-1
receptor, overexpressed in 50-93% of breast cancer tumors (Berns et al.). In almost all cases, the
overexpression of each of these receptors is by a mechanism other than gene duplication. 

One way of examining overexpression at the messenger RNA level is by subtractive 
hybridization. It involves producing positive and negative cDNA strands from two RNA 
preparations, and looking for cDNA which is not completely hybridized by the opposing preparation. 
This is a laborious procedure which has distinct limitations in cancer research. In particular, since
each subtraction involves cDNA from only two cell populations at a time, it is sensitive to individual
phenotypic differences due not just to the presence of cancer, but also to natural metabolic
variations.

Another way of examining overexpression at the messenger RNA level is by differential 
display (Liang et al. 1992a). In this technique, cDNA is prepared from only a subpopulation of each
RNA preparation, and expanded via the polymerase chain reaction using primers of particular 
specificity. Similar subpopulations are compared across several RNA preparations by gel 
autoradiography for expression differences. In order to survey the RNA preparations entirely, the 
assay is repeated with a comprehensive set of PCR primers. The screening strategy more 
effectively includes multiple positive and negative control samples (Sunday et al.). The method has
recently been applied to breast cancer cell lines, and highlights a number of expression differences 
(Liang et al. 1992b; Chen et al., McKenzie et al., Watson et al. 1994 & 1996, Kocher et al.). By 
excising the corresponding region of the separating gel, it is possible to recover and sequence the 
cDNA. 

Despite the advancement provided by differential display, problems remain in terms of 
applying it in the search for new cancer genes. First, because this is a test for RNA levels, any 
phenotypic difference between cell lines constitutes part of the recovered set, leading to a large
proportion of "false positive" identifications. It has been found that cDNA for mitochondrial genes
constitutes a large proportion of the differentially expressed bands, and it consumes substantial
resources to recover the sample and obtain a partial sequence in order to eliminate them. Second,
false positive identifications are made for reasons attributed to multiple cDNA species and
competition for the PCR primers by RNA species of different abundance (Debouck). Third,
differential display highlights high copy number mRNAs and shorter mRNAs (Bertioli et al.,
Yeatman et al.), and may therefore miss critical cancer-associated transcripts when used as a
survey technique. Fourth, a number of adjustments are made to gene expression levels when a
cell undergoes malignant transformation or is cultured in vitro. Most of these adjustments are
secondary, and not part of the transformation process. Thus, even when a novel sequence is 
obtained from the differential display, it is far from certain that the corresponding gene is at the root 
of the disease process. 

An early step in developing gene-specific therapeutic approaches is the identification of 
genes that are more central to malignant transformation or the persistence of the malignant 
phenotype. 

Disclosure of the Invention 

It is an objective of this invention to provide a method for identifying and characterizing 
genes and gene products which are duplicated or associated with overabundant RNA in cancer 
cells. The method can be used for any type of cancer, providing a plurality of cell populations or 
cell lines of the type of cancer are available, in conjunction with a suitable control cell population. 
The method is highly effective in identifying genes and gene products that are intimately related to 
malignant transformation or maintenance of the malignant properties of the cancer cells. 

An important derivative of applying the method is the selection and retrieval of cDNA and 
cDNA fragments corresponding to the cancer-associated gene. These fragments can be used 
inter alia to determine the nucleotide sequence of the gene and mRNA, the amino acid sequence of 
any encoded protein, or to retrieve from a cDNA or genomic library additional polynucleotides 
related to the gene or its transcripts. Since the genes are typically involved in the malignant 
process of the cell, the polynucleotides, polypeptides, and antibodies derived by using this method
can in turn be used to design or screen important diagnostic reagents and therapeutic compounds. 

Another objective of this invention is to provide isolated polynucleotides, polypeptides, and
antibodies derived from four novel genes which are associated with several different types of cancer,
including breast cancer. The genes are designated CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and
CH14-2a16-1. These designations refer to both strands of the cDNA and fragments thereof; and to
the respective corresponding messenger RNA, including splice variants, allelic variants, and
fragments of any of these forms. These genes show RNA overabundance in a majority of cancer cell
lines tested. A majority of the cells showing RNA overabundance also have duplication of the
corresponding gene. Another object of this invention is to provide materials and methods based on
these polynucleotides, polypeptides, and antibodies for use in the diagnosis and treatment of cancer, 
particularly breast cancer. 

Accordingly, one embodiment of this invention is an isolated polynucleotide comprising a 
linear sequence contained in a polynucleotide selected from the group consisting of CH1-9a11-2, 
CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1. The linear sequence is contained in a duplicated
gene or overabundant RNA in cancerous cells. The RNA may be overabundant due to gene 
duplication, increased RNA transcription or processing, increased RNA persistence, any combination 
thereof, or by any other mechanism, in a proportion of breast cancer cells. Preferably, the RNA is 
overabundant in at least about 20% of a representative panel of breast cancer cell lines, such as the 
panels listed herein; more preferably, it is overabundant in at least about 40% of the panel; even more
preferably, it is overabundant in at least 60% or more of the panel. Preferably, the RNA is 
overabundant in at least about 5% of spontaneously occurring breast cancer tumors; more preferably, 
it is overabundant in at least about 10% of such tumors; more preferably, it is overabundant in at least 
about 20% of such tumors; more preferably, it is overabundant in at least about 30% of such tumors; 
even more preferably, it is overabundant in at least about 50% of such tumors.
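
The panel-based preferences stated above can be expressed as a small calculation. The sketch below is an illustration only: it treats the "at least about" language as hard cutoffs, which is a simplifying assumption, and the cell line names and scores are hypothetical placeholders rather than data from the panels listed herein.

```python
def overabundance_fraction(panel_calls):
    """Fraction of cell lines in the panel scored as RNA-overabundant."""
    if not panel_calls:
        raise ValueError("panel is empty")
    positives = sum(1 for call in panel_calls.values() if call)
    return positives / len(panel_calls)


def panel_preference_tier(fraction):
    """Map a panel fraction onto the cell-line tiers stated in the text.

    Hard cutoffs are an assumption; the text says "at least about".
    """
    if fraction >= 0.60:
        return "even more preferred (at least 60%)"
    if fraction >= 0.40:
        return "more preferred (at least about 40%)"
    if fraction >= 0.20:
        return "preferred (at least about 20%)"
    return "below the preferred range"


# Hypothetical panel of ten cell lines; True means RNA overabundance was scored.
example_panel = {f"line_{i}": (i <= 7) for i in range(1, 11)}

fraction = overabundance_fraction(example_panel)
print(f"{fraction:.0%} of panel -> {panel_preference_tier(fraction)}")
```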

Preferably, a sequence of at least 10 nucleotides is essentially identical between the isolated 
polynucleotide of the invention and a cDNA from CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1;
more preferably, a sequence of at least about 15 nucleotides is essentially identical; more
preferably, a sequence of at least about 20 nucleotides is essentially identical; more preferably, a 
sequence of at least about 30 nucleotides is essentially identical; more preferably, a sequence of at 
least about 40 nucleotides is essentially identical; even more preferably, a sequence of at least about 
70 nucleotides is essentially identical; still more preferably, a sequence of about 100 nucleotides or 
more is essentially identical. A further embodiment of this invention is an isolated polynucleotide 
comprising a linear sequence essentially identical to a sequence selected from the group consisting of 
SEQ. ID NO:15, SEQ. ID NO:18, SEQ. ID NO:21, SEQ. ID NO:23, SEQ. ID NO:26, SEQ. ID NO:29, 
SEQ. ID NO:31, SEQ. ID NO:33, and SEQ. ID NO:35. These embodiments include an isolated 
polynucleotide which is a DNA polynucleotide, an RNA polynucleotide, a polynucleotide probe, or a 
polynucleotide primer. 
This invention also provides an isolated polypeptide comprising a sequence of amino acids
essentially identical to the polypeptide encoded by or translated from a polynucleotide selected from
the group consisting of CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1. Preferably, a
sequence of at least about 5 amino acids is essentially identical between the polypeptide of this
invention and that encoded by the polynucleotide; more preferably, a sequence of at least about 10
amino acids is essentially identical; more preferably, a sequence of at least 15 amino acids is
essentially identical; even more preferably, a sequence of at least 20 amino acids is essentially
identical; still more preferably, a sequence of about 30 amino acids or more is essentially identical.
Preferably, the polypeptide comprises a linear sequence of at least 15 amino acids essentially
identical to a sequence encoded by said polynucleotide. Another embodiment of this invention is a
polypeptide comprising a linear sequence essentially identical to a sequence selected from the group
consisting of SEQ. ID NO:17, SEQ. ID NO:20, SEQ. ID NO:25, SEQ. ID NO:28, SEQ. ID NO:30,
SEQ. ID NO:32, SEQ. ID NO:34, and SEQ. ID NO:37.

A further embodiment of this invention is an antibody specific for a polypeptide embodied in 
this invention. This encompasses both monoclonal and isolated polyclonal antibodies. 

A further embodiment of this invention is a method of using the polynucleotides of this 
invention for detecting or measuring gene duplication in cancerous cells, especially but not limited to 
breast cancer cells, comprising the steps of reacting DNA contained in a clinical sample with a 
reagent comprising the polynucleotide, said clinical sample having been obtained from an individual 
suspected of having cancerous cells, and comparing the amount of complexes formed between the 
reagent and the DNA in the clinical sample with the amount of complexes formed between the
reagent and DNA in a control sample. 

A further embodiment is a method of using the polynucleotides of this invention for detecting 
or measuring overabundance of RNA in cancerous cells, especially but not limited to breast cancer 
cells, comprising the steps of reacting RNA contained in a clinical sample with a reagent comprising 
the polynucleotide, said clinical sample having been obtained from an individual suspected of having 
cancerous cells; and comparing the amount of complexes formed between the reagent and the RNA 
in the clinical sample with the amount of complexes formed between the reagent and RNA in a control 
sample. 

Another embodiment of this invention is a diagnostic kit for detecting or measuring gene 
duplication or RNA overabundance in cells contained in an individual as manifest in a clinical sample, 
comprising a reagent and a buffer in suitable packaging, wherein the reagent comprises a 
polynucleotide of this invention. 

Another embodiment of this invention is a method of using a polypeptide of this invention for 
detecting or measuring specific antibodies in a clinical sample, comprising the steps of reacting 
antibodies contained in the clinical sample with a reagent comprising the polypeptide, said clinical 
sample having been obtained from an individual suspected of having cancerous cells, especially but 
not limited to breast cancer cells; and comparing the amount of complexes formed between the
reagent and the antibodies in the clinical sample with the amount of complexes formed between the
reagent and antibodies in a control sample.

WO 97/38085 PCT/US97/05930

Another embodiment of this invention is a method of using an antibody of this invention for 
detecting or measuring altered protein expression in a clinical sample, comprising the steps of 
reacting a polypeptide contained in the clinical sample with a reagent comprising the antibody, said
clinical sample having been obtained from an individual suspected of having cancerous cells, 
especially but not limited to breast cancer cells; and comparing the amount of complexes formed 
between the reagent and the polypeptide in the clinical sample with the amount of complexes formed 
between the reagent and a polypeptide in a control sample. Further embodiments of this invention
are diagnostic kits for detecting or measuring a polypeptide or antibody present in a clinical sample,
comprising a reagent and a buffer in suitable packaging, wherein the reagent respectively comprises 
either an antibody or a polypeptide of this invention. 

Yet another embodiment of this invention is a host cell transfected by a polynucleotide of this 
invention. A further embodiment of this invention is a method for using a polynucleotide for screening
a pharmaceutical candidate, comprising the steps of separating progeny of the transfected host cell
into a first group and a second group; treating the first group of cells with the pharmaceutical 
candidate; not treating the second group of cells with the pharmaceutical candidate; and comparing 
the phenotype of the treated cells with that of the untreated cells. 

This invention also embodies a pharmaceutical preparation for use in cancer therapy,
comprising a polynucleotide or polypeptide embodied by this invention, said preparation being
capable of reducing the pathology of cancerous cells, especially for but not limited to breast cancer 
cells. Further embodiments of this invention are methods for treating an individual bearing cancerous 
cells, such as breast cancer cells, comprising administering any of the aforementioned 
pharmaceutical preparations. 

Still another embodiment of this invention is a pharmaceutical preparation or active vaccine
comprising a polypeptide embodied by this invention in an immunogenic form and a pharmaceutically
compatible excipient. A further embodiment is a method for treatment of cancer, especially but not
limited to breast cancer, either prophylactically or after cancerous cells are present in an individual 
being treated, comprising administration of the aforementioned pharmaceutical preparation. 

Another series of embodiments of this invention relates to methods for obtaining cDNA
corresponding to a gene associated with cancer, comprising the steps of: a) supplying an RNA
preparation from uncultured control cells; b) supplying RNA preparations from at least two different
cancer cells; c) displaying cDNA corresponding to the RNA preparations of step a) and step b)
such that different cDNA corresponding to different RNA in each preparation are displayed
separately; d) selecting cDNA corresponding to RNA that is present in greater abundance in the
cancer cells of step b) relative to the control cells of step a); e) supplying a digested DNA
preparation from control cells; f) supplying digested DNA preparations from at least two different
cancer cells; g) hybridizing the cDNA of step d) with the digested DNA preparations of step e) and
step f); and h) further selecting cDNA from the cDNA of step d) corresponding to genes that are
duplicated in the cancer cells of step f) relative to the control cells of step e). 
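
The two selection stages of this method, step d) for RNA abundance and step h) for gene duplication, can be sketched as successive filters. The sketch assumes that the displayed and hybridized signals have already been reduced to one number per cDNA per cell preparation. The record layout, the fold-change cutoff `min_fold`, and the requirement `min_cancer` that at least two cancer preparations exceed the control are assumptions made for illustration; the method itself does not prescribe numeric criteria.

```python
def count_elevated(cancer_signals, control_signal, min_fold):
    """Count cancer preparations whose signal is >= min_fold times control."""
    return sum(1 for s in cancer_signals if s >= min_fold * control_signal)


def select_candidates(records, min_fold=2.0, min_cancer=2):
    """Apply the RNA selection of step d), then the DNA selection of step h).

    min_fold and min_cancer are hypothetical parameters, not values taken
    from the disclosure.
    """
    selected = []
    for rec in records:
        rna_hits = count_elevated(rec["cancer_rna"], rec["control_rna"], min_fold)
        if rna_hits < min_cancer:  # step d): RNA more abundant in cancer cells
            continue
        dna_hits = count_elevated(rec["cancer_dna"], rec["control_dna"], min_fold)
        if dna_hits < min_cancer:  # step h): gene duplicated in cancer cells
            continue
        selected.append(rec["cdna"])
    return selected


# Made-up signals for three hypothetical cDNA bands and three cancer cell lines.
records = [
    {"cdna": "cdna_1", "control_rna": 1.0, "cancer_rna": [4.0, 3.5, 1.1],
     "control_dna": 1.0, "cancer_dna": [5.0, 2.5, 1.0]},
    {"cdna": "cdna_2", "control_rna": 1.0, "cancer_rna": [3.0, 2.2, 2.8],
     "control_dna": 1.0, "cancer_dna": [1.0, 1.1, 0.9]},
    {"cdna": "cdna_3", "control_rna": 1.0, "cancer_rna": [1.2, 0.9, 2.5],
     "control_dna": 1.0, "cancer_dna": [1.0, 1.0, 2.4]},
]

print(select_candidates(records))
```

In this made-up data set, only the first cDNA passes both filters; the second is overabundant at the RNA level without evidence of duplication, and the third is elevated in only one cancer preparation.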

One or more enhancements may optionally be included in the methods of this invention, 
including the following: 

1. Cancer cells are preferably used for step b) that share a duplicated gene in the same
region of a chromosome. If desired, the practitioner may test cancer cells beforehand
to detect the duplication or deletion of chromosome regions; or cancer cell lines may
be used that have already been characterized in this respect.

2. A higher plurality of cancer cells are preferably used to provide DNA for step b), step f),
or preferably both step b) and step f). The use of three cancer cells is preferred over
two; the use of four cancer cells is more preferred, about five cancer cells is still more
preferred, about eight cancer cells is even more preferred. The cDNA of each cancer
cell population is displayed or hybridized separately, in accordance with the method.

3. A higher plurality of control cells are preferably used to provide DNA for step a), step
e), or preferably both step a) and step e). The use of two control cell populations is
preferred; the use of three or more is even more preferred. Both proliferating and
non-proliferating populations are preferably used, if available.

4. The control cells are preferably supplied fresh from a tissue source, and are not
cultured or transformed into a cell line. This is increasingly important when the control
cell populations used in step a) are only one or two in number. Freshly obtained cancer
cells may also be used as an alternative to cancer cell lines, although this is less
critical.

5. An additional screening step is preferably conducted in which the cDNA corresponding
to the putative cancer-associated gene is additionally hybridized with a digested
mitochondrial DNA preparation, to eliminate mitochondrial genes. This screening step
may be conducted before, between, subsequent to, or simultaneously with the other
screening steps of the method.

6. An additional screening step is preferably conducted in which RNA is supplied from a
plurality of cancer cells, and one or preferably more control cell populations; the RNA is
contacted with cDNA corresponding to the putative cancer-associated gene under
conditions that permit formation of a stable duplex, and cDNA is selected
corresponding to RNA that is present in greater abundance in a proportion of the
cancer cells relative to the control cells. Preferably, the plurality of cancer cells is a
panel of at least five, preferably at least ten cells. Preferably at least three, more
preferably at least five of the cancer cells show greater abundance of RNA. Preferably
at least one and preferably more of the cancer cells show a greater abundance of
RNA compared with control cells, but do not show duplication of the corresponding
gene in step h) of the method.
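
The panel criteria of the sixth enhancement can be checked mechanically once each cancer cell in the panel has been scored. The sketch below uses the minimum preferred values stated above (a panel of at least five cells, at least three showing greater RNA abundance, and at least one showing greater RNA abundance without duplication of the gene). The function name, the tuple layout, and the example scores are illustrative assumptions.

```python
def passes_panel_screen(panel, min_panel=5, min_overabundant=3,
                        min_without_duplication=1):
    """Check a scored panel against the preferred criteria of enhancement 6.

    panel is a list of (rna_overabundant, gene_duplicated) booleans, one pair
    per cancer cell. The default thresholds are the minimum preferred values
    stated in the text.
    """
    if len(panel) < min_panel:
        return False
    # Keep the duplication status of every cell that shows RNA overabundance.
    duplication_among_overabundant = [dup for rna, dup in panel if rna]
    if len(duplication_among_overabundant) < min_overabundant:
        return False
    without_duplication = sum(1 for dup in duplication_among_overabundant if not dup)
    return without_duplication >= min_without_duplication


# Hypothetical scores for a five-cell panel (not experimental results).
example_panel = [
    (True, True),
    (True, True),
    (True, False),   # overabundant RNA without gene duplication
    (False, False),
    (True, True),
]

print(passes_panel_screen(example_panel))
```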
Other embodiments of the invention are methods for obtaining cDNA corresponding to a
gene that is deleted or underexpressed in cancer, comprising the steps of: a) supplying an RNA
preparation from control cells; b) supplying RNA preparations from at least two different cancer
cells that share a deleted gene in the same region of a chromosome; c) displaying cDNA
corresponding to the RNA preparations of step a) and step b) such that different cDNA
corresponding to different RNA in each preparation are displayed separately; and d) selecting
cDNA corresponding to RNA that is present in lower abundance in the cancer cells of step b)
relative to the control cells of step a). Such methods typically comprise the following further steps:
e) supplying a digested DNA preparation from control cells; f) supplying digested DNA
preparations from at least two different cancer cells; g) hybridizing the cDNA of step d) with the
digested DNA preparations of step e) and step f); and h) further selecting cDNA from the cDNA of
step d) corresponding to a gene that is deleted in the cancer cells of step f) relative to the control 
cells of step e). Such methods for identifying deleted or underexpressed genes may also comprise 
enhancements such as those described above. 

Additional embodiments of this invention are methods for characterizing cancer genes,
comprising obtaining cDNA corresponding to a cancer-associated gene according to a method of
this invention, particularly those highlighted above, and then sequencing the cDNA. Alternatively or 
in addition, the cDNA may be used to rescue additional polynucleotides corresponding to a cancer- 
associated gene from an mRNA preparation, or a cDNA or genomic DNA library. 

Additional embodiments of this invention are methods for screening candidate drugs for
cancer treatment, comprising obtaining cDNA corresponding to a gene that is duplicated,
overexpressed, deleted, or underexpressed in cancer, and comparing the effect of the candidate 
drug on a cell genetically altered with the cDNA or fragment thereof with the effect on a cell not 
genetically altered. 

Various embodiments of this invention may be employed in pursuit of any form of cancer
for which suitable tissue sources are available. Cancers of particular interest include lung cancer,
glioblastoma, pancreatic cancer, colon cancer, prostate cancer, hepatoma, myeloma, and breast 
cancer. 

BRIEF DESCRIPTION OF THE DRAWINGS

Figure 1 is a half-tone reproduction of an autoradiogram of a differential display experiment, in which 
radiolabeled cDNA corresponding to a subset of total messenger RNA in different cells are compared. 
This is used to select cDNA corresponding to particular RNA that are overabundant in breast cancer. 

Figure 2 is a half-tone reproduction of an autoradiogram of electrophoresed DNA digests from a
panel of breast cancer cell lines probed with a CH8-2a13-1 insert (Panel A) or a loading control (Panel
B).

Figure 3 is a half-tone reproduction of an autoradiogram of electrophoresed total RNA from a panel of
breast cancer cell lines probed with a CH8-2a13-1 insert (Panel A) or a loading control (Panel B).

Figure 4 is a half-tone reproduction of an autoradiogram of electrophoresed DNA digests from a
panel of breast cancer cell lines probed with a CH13-2a12-1 insert.

Figure 5 is a half-tone reproduction of an autoradiogram of electrophoresed total RNA from a panel of
breast cancer cell lines probed with a CH13-2a12-1 insert.

10 

Figure 6 is a map of cONA fragments obtained for the breast cancer associated genes CH1-9a1 1-2, 
CH8-2a13-1, CH13-2a12-1 and CH14-2a16-1. Regions of the fragments used to deduce sequence 
data listed in the application are indicated by shading. Nucleotide positions are numbered from the 
left-most residue for which double-strand sequence data has been obtained, which is not necessarily 
1 5 the 5' terminus of the corresponding message. 

Figure 7 is a listing of primers used for obtaining the cDNA sequence data for CH1-9a11-2.

Figure 8 is a listing of cDNA sequence obtained for CH1-9a11-2.

Figure 9 is a listing of the amino acid sequence corresponding to the longest open reading frame of
the DNA sequence of CH1-9a11-2 shown in Figure 8. The single-letter amino acid code is used.
Stop codons are indicated by a dot (•). The upper panel shows the complete amino acid translation;
the lower panel shows the predicted gene product protein sequence. A possible transmembrane
region is indicated by underlining.



Figure 10 is a listing of primers used for obtaining the cDNA sequence data for CH8-2a13-1.

Figure 11 is a listing of cDNA sequence obtained for CH8-2a13-1.

Figure 12 is a listing of the amino acid sequence corresponding to the longest open reading frame of
the DNA sequence of CH8-2a13-1 shown in Figure 11. The upper panel shows the complete amino
acid translation; the lower panel shows the predicted gene product protein sequence.

Figure 13 is a listing of the nucleotide sequence predicted for a full-length CH8-2a13-1 cDNA.

Figure 14 is a listing of the amino acid sequence corresponding to the longest open reading frame of
the DNA sequence of CH8-2a13-1 shown in Figure 13.

Figure 15 is a listing of primers used for obtaining the cDNA sequence data for CH13-2a12-1.

Figure 16 is a listing of cDNA sequence obtained for CH13-2a12-1.

Figure 17 is a listing of the amino acid sequence corresponding to the longest open reading frame of
the DNA sequence of CH13-2a12-1 shown in Figure 16. The upper panel shows the complete amino
acid translation; the lower panel shows the predicted gene product protein sequence.

Figure 18 is a listing of primers used for obtaining cDNA sequence data for CH13-2a12-1.

Figure 19 is a listing of the cDNA sequence data obtained by two-directional sequencing for
CH14-2a16-1.

Figure 20 is a listing of the amino acid sequence corresponding to the longest open reading frame of
the DNA sequence of CH14-2a16-1 shown in Figure 19. The upper panel shows the complete amino
acid translation; the lower panel shows the predicted gene product protein sequence. Residues
corresponding to three zinc finger motifs are underlined, indicating that the protein may have DNA or
RNA binding activity.

Figure 21 is a listing of additional DNA sequence data towards the 5' end of CH14-2a16-1 obtained
by one-directional sequencing of the fragment pCH14-1.3. The first two panels show nucleotide and
amino acid sequence from the 5' end of the fragment; the second two panels show nucleotide and
amino acid sequence from the 3' end of the fragment. Regions of overlap with pCH14-600 are
underlined.

Figure 22 is a listing of the nucleotide sequences of initial fragments obtained corresponding to the 
four breast cancer associated genes, along with their amino acid translations. 

Figure 23 is a listing of additional cDNA sequence obtained for CH1-9a11-2, comprising
approximately 1934 base pairs 5' from the sequence of Figure 8.

Figure 24 is a listing of the amino acid sequence corresponding to the longest open reading frame of
the DNA sequence of CH1-9a11-2 shown in Figure 23. The single-letter amino acid code is used.
Stop codons are indicated by a dot (•).

Figure 25 is a listing of additional cDNA sequence obtained for CH14-2a16-1, comprising
approximately 1934 base pairs 5' from the sequence of Figure 19.

Figure 26 is a listing of the amino acid sequence corresponding to the longest open reading frame of
the DNA sequence of CH1-9a11-2 shown in Figure 25. The single-letter amino acid code is used.
Stop codons are indicated by a dot (•). The upper panel shows the complete amino acid translation;
the lower panel shows the predicted gene product protein sequence.

BEST MODE FOR CARRYING OUT THE INVENTION

This invention relates to the discovery and characterization of four novel genes associated
with breast cancer. The cDNA of these genes, and their sequences as disclosed below, provide the
basis of a series of reagents that can be used in diagnosis and therapy.

Using a panel of about 15 cancer cell lines, each of the four genes was found to be duplicated
in 40-60% of the cells tested. Surprisingly, each of the four genes was duplicated in at least one cell
line where studies using comparative genomic hybridization had not revealed any amplification of the
corresponding chromosomal region.

Levels of expression at the mRNA level were tested in a similar panel for two of these four
genes. In addition to those cell lines showing gene duplication, 17 to 37% of the lines showed RNA
overabundance without gene duplication, indicating that the malignant cells had used some
mechanism other than gene duplication to promote the abundance of RNA corresponding to these
genes. All four of the breast cancer genes have open reading frames, and likely are transcribed at
various levels in different cell types. Overabundance of the corresponding RNA in a cancerous cell is
likely associated with overexpression of the protein gene product. Such overexpression may be
manifest as increased secretion of the protein from the cell into blood or the surrounding environment,
an increased density of the protein at the cell surface, or an increased accumulation of the protein
within the cell, in comparison to the typical level in noncancerous cells of the same tissue type.

Different tumors bear different genotypes and phenotypes, even when derived from the same
tissue. Gene therapy in cancer is more likely to be effective if it is aimed at genes that are involved in
supporting the malignancy of the cancer. This invention discloses genes that achieve RNA
overabundance by several mechanisms, because they are more likely to be directly involved in the
pathogenic process, and therefore suitable targets for pharmacological manipulation.

Features of the four novel genes, the respective mRNA, and the cDNA used to find them are
provided in Table 1.

TABLE 1: Characteristics of 4 Novel Breast Cancer Genes

Chromosome   Designation    mRNA Observed     Exemplary cDNA Fragments Cloned
1            CH1-9a11-2     5.5 kb, 4.5 kb    1.1 kb, ?.5 kb
8            CH8-2a13-1     4.2 kb            0.6 kb (two), 3.0 kb, 4.0 kb
13           CH13-2a12-1    3.5 kb, 3.2 kb    1.6 kb, 3.5 kb
14           CH14-2a16-1    3.8 kb, 3 kb      0.8 kb, 1.3 kb, 1.6 kb, 2.5 kb


All four gene sequences are unrelated to other genes known to be overexpressed in breast
cancer, including the erbB2 gene (Adnane et al.), tissue factor (Chen et al.), mammaglobulin (Watson
et al.), and DD96 (Kocher et al.).

The four mRNA sequences each comprise an open reading frame. The CH1-9a11-2 gene is
expressed at the mRNA level at relatively elevated levels in pancreas and testis. The CH8-2a13-1
gene is expressed at relatively elevated levels in adult heart, spleen, thymus, small intestine, colon,
and tissues of the reproductive system; and at higher levels in certain tissues of the fetus. The
CH13-2a12-1 gene is expressed at relatively elevated levels in heart, skeletal muscle, and testis. The
CH14-2a16-1 gene is expressed at relatively elevated levels in testis. The level of expression of all four
genes is especially high in a substantial proportion of breast cancer cell lines.

The CH1-9a11-2 gene encodes a protein with a putative transmembrane region, and may be
expressed as a surface protein on cancer cells. The CH13-2a12-1 gene is distantly related to a C.
elegans gene implicated in cell cycle regulation, and may play a role in the regulation of cell
proliferation. The protein encoded by CH13-2a12-1 is distantly related to a vasopressin-activated
calcium binding receptor, and may have Ca++ binding activity. The CH14-2a16-1 comprises at least
five domains of a zinc finger binding motif and is distantly related to a yeast RNA binding protein. The
CH14-2a16-1 gene product is suspected of having DNA or RNA binding activity, which may relate to a
role in cancer pathogenesis.

The four genes described here are exemplars of genes that undergo altered expression in
cancer, identifiable using the gene screening methods of the invention. The method involves an
analysis for both DNA duplication and altered RNA abundance relating to the same gene. Since
abnormal gene regulation is central to the malignant process, the identification method may be
brought to bear on any type of cancer.

The screening method is superior to any previously available approach in several respects.
Particularly significant is that screening is rapidly focused towards genes that are central to the
malignant process, and away from those that have variable levels of expression as part of normal
metabolic processes. Furthermore, because the end-product is a cDNA corresponding to the
gene, the process leads rapidly to detailed characterization of the gene, and any effector molecule
it may encode. This in turn leads to development of new diagnostic and therapeutic materials and
techniques.



Definitions 



Terms used in this application include the following: 

The term "polynucleotide" refers to a polymeric form of nucleotides of any length, either
deoxyribonucleotides or ribonucleotides, or analogs thereof. Polynucleotides may have any
three-dimensional structure, and may perform any function, known or unknown. The following are
non-limiting examples of polynucleotides: a gene or gene fragment, exons, introns, messenger RNA
(mRNA), transfer RNA, ribosomal RNA, ribozymes, cDNA, recombinant polynucleotides, branched
polynucleotides, plasmids, vectors, isolated DNA of any sequence, isolated RNA of any sequence,
nucleic acid probes, and primers. A polynucleotide may comprise modified nucleotides, such as
methylated nucleotides and nucleotide analogs. If present, modifications to the nucleotide structure
may be imparted before or after assembly of the polymer. The sequence of nucleotides may be
interrupted by non-nucleotide components. A polynucleotide may be further modified after
polymerization, such as by conjugation with a labeling component.

The term polynucleotide, as used herein, refers interchangeably to double- and 
single-stranded molecules. Unless otherwise specified or required, any embodiment of the invention 
described herein that is a polynucleotide encompasses both the double-stranded form, and each of 
two complementary single-stranded forms known or predicted to make up the double-stranded form. 

In the context of polynucleotides, a "linear sequence" or a "sequence" is an order of 
nucleotides in a polynucleotide in a 5' to 3' direction in which residues that neighbor each other in the 
sequence are contiguous in the primary structure of the polynucleotide. A "partial sequence" is a 
linear sequence of part of a polynucleotide which is known to comprise additional residues in one or 
both directions. 

"Hybridization" refers to a reaction in which one or more polynucleotides react to form a 
complex that is stabilized via hydrogen bonding between the bases of the nucleotide residues. The 
hydrogen bonding is sequence-specific, and typically occurs by Watson-Crick base pairing. A
hybridization reaction may constitute a step in a more extensive process, such as the initiation of a 
PCR, or the enzymatic cleavage of a polynucleotide by a ribozyme. 

Hybridization reactions can be performed under conditions of different "stringency". Relevant
conditions include temperature, ionic strength, time of incubation, the presence of additional solutes in
the reaction mixture such as formamide, and the washing procedure. Higher stringency conditions
are those conditions, such as higher temperature and lower sodium ion concentration, which require
higher minimum complementarity between hybridizing elements for a stable hybridization complex to
form. Conditions that increase the stringency of a hybridization reaction are widely known and
published in the art: see, for example, "Molecular Cloning: A Laboratory Manual", Second Edition
(Sambrook, Fritsch & Maniatis, 1989).
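
The direction of these effects (higher temperature, lower sodium ion concentration, and added formamide each raise stringency) can be illustrated with a rough melting-temperature estimate. The sketch below uses a commonly quoted empirical approximation for DNA-DNA hybrids; the formula, the function names, and the sodium content assumed for 1 x SSC are illustrative assumptions and are not taken from this application.

```python
import math

# Assumption: 1 x SSC supplies roughly 0.195 M Na+
# (0.15 M NaCl plus 0.015 M trisodium citrate).
SODIUM_MOLAR_PER_1X_SSC = 0.195


def ssc_to_sodium_molar(ssc_fold):
    """Convert an SSC strength (e.g. 6 for 6 x SSC) to approximate molar Na+."""
    return ssc_fold * SODIUM_MOLAR_PER_1X_SSC


def estimate_tm_celsius(sodium_molar, percent_gc, length_nt, percent_formamide=0.0):
    """Rough melting temperature of a DNA-DNA hybrid (empirical approximation)."""
    if sodium_molar <= 0 or length_nt <= 0:
        raise ValueError("sodium concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(sodium_molar)   # lower salt lowers Tm
            + 0.41 * percent_gc                 # GC-rich duplexes are more stable
            - 0.61 * percent_formamide          # formamide destabilizes the duplex
            - 500.0 / length_nt)                # short duplexes are less stable


def stringency_margin(incubation_celsius, tm_celsius):
    """Degrees below the estimated Tm; a smaller margin means higher stringency."""
    return tm_celsius - incubation_celsius


if __name__ == "__main__":
    # Hypothetical 40-nucleotide probe with 50% GC content.
    for ssc, temp, formamide in [(6.0, 40.0, 0.0), (0.5, 40.0, 0.0), (6.0, 30.0, 50.0)]:
        tm = estimate_tm_celsius(ssc_to_sodium_molar(ssc), 50.0, 40, formamide)
        print(ssc, temp, formamide, round(stringency_margin(temp, tm), 1))
```

The estimate is only a guide for comparing conditions against each other; it does not replace the published tables and protocols cited above.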

When hybridization occurs in an antiparallel configuration between two single-stranded 
polynucleotides, those polynucleotides are described as "complementary". A double-stranded
polynucleotide can be "complementary" to another polynucleotide, if hybridization can occur between 
one of the strands of the first polynucleotide and the second. Complementarity (the degree that one 
polynucleotide is complementary with another) is quantifiable in terms of the proportion of bases in 
opposing strands that are expected to form hydrogen bonding with each other, according to generally 
accepted base-pairing rules. 
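
As a minimal sketch of how complementarity might be quantified, the following computes the proportion of positions at which two equal-length single strands, each written 5' to 3', would pair in an antiparallel configuration under Watson-Crick rules. The function names are hypothetical, and the restriction to equal-length, ungapped strands is a simplifying assumption.

```python
# Watson-Crick partners, with RNA uracil treated like thymine.
PARTNER = {"A": "T", "T": "A", "G": "C", "C": "G"}


def _normalize(seq):
    """Upper-case the sequence and treat uracil as thymine."""
    return seq.upper().replace("U", "T")


def reverse_complement(seq):
    """Return the strand that pairs perfectly with seq, written 5' to 3'."""
    return "".join(PARTNER[base] for base in reversed(_normalize(seq)))


def complementarity(strand_a, strand_b):
    """Fraction of positions that base-pair when the strands are antiparallel.

    Both strands are given 5' to 3' and must have the same length.
    """
    a, b = _normalize(strand_a), _normalize(strand_b)
    if len(a) != len(b) or not a:
        raise ValueError("strands must be non-empty and of equal length")
    # Reverse strand_b so that position i of a faces its antiparallel partner.
    paired = sum(1 for x, y in zip(a, reversed(b)) if PARTNER[x] == y)
    return paired / len(a)


if __name__ == "__main__":
    print(complementarity("ATGC", reverse_complement("ATGC")))
    print(complementarity("ATGC", "GCAA"))
```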

A linear sequence of nucleotides is "identical" to another linear sequence, if the order of
nucleotides in each sequence is the same, and occurs without insertion, deletion, or material
substitution. It is understood that purine and pyrimidine nitrogenous bases with similar structures can
be functionally equivalent in terms of Watson-Crick base-pairing; and the inter-substitution of like
nitrogenous bases, particularly uracil and thymine, or the modification of nitrogenous bases, such as
by methylation, does not constitute a material substitution. An RNA and a DNA polynucleotide have
identical sequences when the sequence for the RNA reflects the order of nitrogenous bases in the
polyribonucleotide, the sequence for the DNA reflects the order of nitrogenous bases in the
polydeoxyribonucleotide, and the two sequences satisfy the other requirements of this definition.
Where one or both of the polynucleotides being compared is double-stranded, the sequences are
identical if one strand of the first polynucleotide is identical with one strand of the second
polynucleotide.

A linear sequence of nucleotides is "essentially identical" to another linear sequence, if both
sequences are capable of hybridizing to form a duplex with the same complementary polynucleotide.
Sequences that hybridize under conditions of greater stringency are more preferred. It is understood
that hybridization reactions can accommodate insertions, deletions, and substitutions in the nucleotide
sequence. Thus, linear sequences of nucleotides can be essentially identical even if some of the
nucleotide residues do not precisely correspond or align. In general, essentially identical sequences
of about 40 nucleotides in length will hybridize at about 30°C in 10 x SSC (0.15 M NaCl, 15 mM
citrate buffer); preferably, they will hybridize at about 40°C in 6 x SSC; more preferably, they will
hybridize at about 50°C in 6 x SSC; even more preferably, they will hybridize at about 60°C in 6 x
SSC, or at about 40°C in 0.5 x SSC, or at about 30°C in 6 x SSC containing 50% formamide; still
more preferably, they will hybridize at 40°C or higher in 2 x SSC or lower in the presence of 50% or
more formamide. It is understood that the rigor of the test is partly a function of the length of the
polynucleotide; hence shorter polynucleotides with the same homology should be tested under lower
stringency and longer polynucleotides should be tested under higher stringency, adjusting the
conditions accordingly. The relationship between hybridization stringency, degree of sequence
identity, and polynucleotide length is known in the art and can be calculated by standard formulae;
see, e.g., Meinkoth et al. Sequences that correspond or align more closely to the invention disclosed
herein are comparably more preferred. Generally, essentially identical sequences are at least about
50% identical with each other, after alignment of the homologous regions. Preferably, the sequences
are at least about 60% identical; more preferably, they are at least about 70% identical; more 
preferably, they are at least about 80% identical; more preferably, the sequences are at least about 
90% identical; even more preferably, they are at least 95% identical; still more preferably, the 
sequences are 100% identical. Percent identity is calculated as the percent of residues in the 
sequence being compared that are identical to those in the reference sequence, which is usually one 
of those listed or described in this application, unless stated otherwise. No penalty is imposed for 
introduction of gaps in the reference or comparison sequence for purposes of alignment, but the 
resulting fragments must be rationally derived - small gaps may not be introduced to trivially improve 
the identity score. 
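
The percent identity calculation described above can be sketched as follows for two sequences that have already been aligned. The function name, the use of "-" as the gap symbol, and the choice to skip every gapped column (one reading of "no penalty is imposed for introduction of gaps") are assumptions made for illustration.

```python
def percent_identity(aligned_reference, aligned_comparison, gap="-"):
    """Percent of compared residues identical to the reference.

    Both inputs are rows of one alignment and must have equal length.
    Columns containing a gap in either row are skipped, so gaps carry
    no penalty (an assumed interpretation of the definition).
    """
    if len(aligned_reference) != len(aligned_comparison):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    identical = 0
    for ref, cmp_res in zip(aligned_reference.upper(), aligned_comparison.upper()):
        if ref == gap or cmp_res == gap:
            continue  # gapped column: not scored
        compared += 1
        if ref == cmp_res:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


if __name__ == "__main__":
    # Hypothetical aligned fragments, not sequences from this application.
    print(percent_identity("ATGCA", "ATCCA"))
    print(percent_identity("ATG-CA", "ATGTCA"))
```

Because gapped columns are free under this reading, the sketch does not by itself enforce the requirement that gaps be rationally derived; that judgment remains with whoever prepares the alignment.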

In determining whether polynucleotide sequences are essentially identical, a sequence that 
preserves the functionality of the polynucleotide with which it is being compared is particularly 
preferred. Functionality may be established by different criteria, such as ability to hybridize with a 
target polynucleotide, and whether the polynucleotide encodes an identical or essentially identical 
polypeptide. Thus, nucleotide substitutions which cause a non-conservative substitution in the
encoded polypeptide are preferred over nucleotide substitutions that create a stop codon; nucleotide 
substitutions that cause a conservative substitution in the encoded polypeptide are more preferred, 
and identical nucleotide sequences are even more preferred. Insertions or deletions in the 
polynucleotide that result in insertions or deletions in the polypeptide are preferred over those that 
result in the down-stream coding region being rendered out of phase. The relative importance of 
hybridization properties and the polypeptide encoded by a polynucleotide depends on the application 
of the invention. 

A "reagent" polynucleotide, polypeptide, or antibody, is a substance provided for a reaction, 
the substance having some known and desirable parameters for the reaction. A reaction mixture may 
also contain a "target", such as a polynucleotide, antibody, or polypeptide that the reagent is capable
of reacting with. For example, in some types of diagnostic tests, the amount of the target in a sample
is determined by adding a reagent, allowing the reagent and target to react, and measuring the
amount of reaction product. In the context of clinical management, a "target" may also be a cell,
collection of cells, tissue, or organ that is the object of an administered substance, such as a
pharmaceutical compound.

"cDNA" or "complementary DNA" is a single- or double-stranded DNA polynucleotide in which
one strand is complementary to a messenger RNA. "Full-length cDNA" is cDNA comprised of a strand
which is complementary to an entire messenger RNA molecule. A "cDNA fragment" as used herein
generally represents a sub-region of the full-length form, but the entire full-length cDNA may also be
included. Unless explicitly specified, the term cDNA encompasses both the full-length form and the
fragment form.

Different polynucleotides are said to "correspond" to each other if one is ultimately derived
from another. For example, messenger RNA corresponds to the gene from which it is transcribed.
cDNA corresponds to the RNA from which it has been produced, such as by a reverse transcription
reaction, or by chemical synthesis of a DNA based upon knowledge of the RNA sequence. cDNA
also corresponds to the gene that encodes the RNA. Polynucleotides may be said to correspond
even when one of the pair is derived from only a portion of the other.

A "probe" when used in the context of polynucleotide manipulation refers to a polynucleotide
which is provided as a reagent to detect a target potentially present in a sample of interest by
hybridizing with the target. Usually, a probe will comprise a label or a means by which a label can be
attached, either before or subsequent to the hybridization reaction. Suitable labels include, but are not
limited to, radioisotopes, fluorochromes, chemiluminescent compounds, dyes, and enzymes.

A "primer" is a short polynucleotide, generally with a free 3'-OH group, that binds to a target
potentially present in a sample of interest by hybridizing with the target, and thereafter promotes
polymerization of a polynucleotide complementary to the target. A "polymerase chain reaction"
("PCR") is a reaction in which replicate copies are made of a target polynucleotide using one or more
primers, and a catalyst of polymerization, such as a reverse transcriptase or a DNA polymerase, and
particularly a thermally stable polymerase enzyme. Methods for PCR are taught in U.S. Patent Nos.
4,683,195 (Mullis) and 4,683,202 (Mullis et al.). All processes of producing replicate copies of the
same polynucleotide, such as PCR or gene cloning, are collectively referred to herein as "replication."

An "operon" is a genetic region comprising a gene encoding a protein and functionally related
5' and 3' flanking regions. Elements within an operon include but are not limited to promoter regions,
enhancer regions, repressor binding regions, transcription initiation sites, ribosome binding sites,
translation initiation sites, protein encoding regions, introns and exons, and termination sites for
transcription and translation. A "promoter" is a DNA region capable under certain conditions of
binding RNA polymerase and initiating transcription of a coding region located downstream (in the 3'
direction) from the promoter. "Operably linked" refers to a juxtaposition of genetic elements, wherein
the elements are in a relationship permitting them to operate in the expected manner. For instance, a
promoter is operably linked to a coding region if the promoter helps initiate transcription of the coding
sequence. There may be intervening residues between the promoter and coding region so long as
this functional relationship is maintained.

"Gene duplication" is a term used herein to describe the process whereby an increased 
number of copies of a particular gene or a fragment thereof is present in a particular cell or cell line.
"Gene amplification" generally is synonymous with gene duplication.

"Expression" is defined alternately in the scientific literature either as the transcription of a
gene into an RNA polynucleotide, or as the transcription and subsequent translation into a
polypeptide. As used herein, "expression" or "gene expression" generally refers to the production of
the RNA unless specified or required otherwise. Thus, "RNA overexpression" reflects the presence of
more RNA (as a proportion of total RNA) from a particular gene in a cell being described, such as a
cancerous cell, in relation to that of the cell it is being compared with, such as a non-cancerous cell.
The protein product of the gene may or may not be produced in normal or abnormal amounts.
"Protein overexpression" similarly reflects the presence of relatively more protein present in or
produced by, for example, a cancerous cell.

"Abundance" of RNA refers to the amount of a particular RNA present in a particular cell type.
Thus, "RNA overabundance" or "overabundance of RNA" describes RNA that is present in greater
proportion of total RNA in the cell type being described, compared with the same RNA as a proportion
of the total RNA in a control cell. A number of mechanisms may contribute to RNA overabundance in
a particular cell type: for example, gene duplication, increased level of transcription of the gene,
increased persistence of the RNA within the cell after it is produced, or any combination of these.
Similarly, "lower abundance" or "underabundance" describes RNA that is present in lower
proportion in the cell being described compared with a control cell.
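
The comparison in this definition, a particular RNA as a proportion of total RNA in the cell being described versus the same proportion in a control cell, can be sketched as below. The function names, the idea of using a loading-control signal as a stand-in for total RNA, and the two-fold cutoff are hypothetical choices made for illustration only.

```python
def fold_abundance(test_gene_signal, test_total_signal,
                   control_gene_signal, control_total_signal):
    """Ratio of the gene's share of total RNA in the test cell to its share in the control."""
    if min(test_total_signal, control_gene_signal, control_total_signal) <= 0:
        raise ValueError("total and control signals must be positive")
    test_proportion = test_gene_signal / test_total_signal
    control_proportion = control_gene_signal / control_total_signal
    return test_proportion / control_proportion


def classify_abundance(fold, threshold=2.0):
    """Label a fold value; the two-fold threshold is an assumed cutoff."""
    if fold >= threshold:
        return "overabundant"
    if fold <= 1.0 / threshold:
        return "underabundant"
    return "comparable"


if __name__ == "__main__":
    # Hypothetical densitometry readings (arbitrary units).
    fold = fold_abundance(8.0, 16.0, 1.0, 8.0)
    print(fold, classify_abundance(fold))
```

Normalizing to the total (or loading-control) signal before taking the ratio is what distinguishes overabundance, as defined here, from a difference that merely reflects how much RNA was loaded.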

The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to
polymers of amino acids of any length. The polymer may be linear or branched, it may comprise
modified amino acids, and it may be interrupted by non-amino acids. The terms also encompass an
amino acid polymer that has been modified; for example, by disulfide bond formation, glycosylation,
lipidation, acetylation, phosphorylation, or any other manipulation, such as conjugation with a labeling
component.

In the context of polypeptides, a "linear sequence" or a "sequence" is an order of amino acids
in a polypeptide in an N-terminal to C-terminal direction in which residues that neighbor each other in
the sequence are contiguous in the primary structure of the polypeptide. A "partial sequence" is a
linear sequence of part of a polypeptide which is known to comprise additional residues in one or both
directions.

A linear sequence of amino acids is "essentially identical" to another sequence if the two
sequences have a substantial degree of sequence identity. It is understood that the functional
proteins can accommodate insertions, deletions, and substitutions in the amino acid sequence. Thus,
linear sequences of amino acids can be essentially identical even if some of the residues do not
precisely correspond or align. Sequences that correspond or align more closely to the invention
disclosed herein are more preferred. It is also understood that some amino acid substitutions are
more easily tolerated. For example, substitution of an amino acid with hydrophobic side chains,
aromatic side chains, polar side chains, side chains with a positive or negative charge, or side chains
comprising two or fewer carbon atoms, by another amino acid with a side chain of like properties can
occur without disturbing the essential identity of the two sequences. Methods for determining
homologous regions and scoring the degree of homology are well known in the art; see for example
Altschul et al. and Henikoff et al. Well-tolerated sequence differences are referred to as "conservative
substitutions". Thus, sequences with conservative substitutions are preferred over those with other
substitutions in the same positions; sequences with identical residues at the same positions are still
more preferred. In general, amino acid sequences that are essentially identical are at least about
15% identical, and comprise at least about another 15% which are either identical or are conservative
substitutions, after alignment of homologous regions. More preferably, essentially identical
sequences comprise at least about 50% identical residues or conservative substitutions; more
preferably, they comprise at least about 70% identical residues or conservative substitutions; more
preferably, they comprise at least about 80% identical residues or conservative substitutions; more
preferably, they comprise at least about 90% identical residues or conservative substitutions; more
preferably, they comprise at least about 95% identical residues or conservative substitutions; even
more preferably, they contain 100% identical residues.
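
A simple way to tally identical residues and conservative substitutions in a pre-aligned pair of polypeptide sequences is sketched below. The side-chain groupings are illustrative assumptions loosely following the categories named above; they are not a scoring scheme defined in this application, and established methods such as those of Altschul et al. and Henikoff et al. use substitution matrices instead.

```python
# Assumed, illustrative side-chain classes (single-letter amino acid code).
SIDE_CHAIN_GROUPS = [
    set("VLIM"),    # hydrophobic
    set("FWY"),     # aromatic
    set("STNQ"),    # polar
    set("KRH"),     # positively charged
    set("DE"),      # negatively charged
    set("GASC"),    # small side chains (two or fewer carbon atoms)
]


def is_conservative(res_a, res_b):
    """True if two different residues share at least one assumed side-chain class."""
    return res_a != res_b and any(res_a in g and res_b in g for g in SIDE_CHAIN_GROUPS)


def identity_and_similarity(aligned_a, aligned_b, gap="-"):
    """Return (percent identical, percent identical or conservative) over ungapped columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = conservative = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # gapped columns are not scored
        compared += 1
        if a == b:
            identical += 1
        elif is_conservative(a, b):
            conservative += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return (100.0 * identical / compared,
            100.0 * (identical + conservative) / compared)


if __name__ == "__main__":
    # Hypothetical peptide fragments.
    print(identity_and_similarity("KLDF", "RLEW"))
```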

In determining whether polypeptide sequences are essentially identical, a sequence that
preserves the functionality of the polypeptide with which it is being compared is particularly preferred.
Functionality may be established by different parameters, such as enzymatic activity, the binding rate
or affinity in a receptor-ligand interaction, the binding affinity with an antibody, and X-ray
crystallographic structure.

An "antibody" (interchangeably used in plural form) is an immunoglobulin molecule capable of
specific binding to a target, such as a polypeptide, through at least one antigen recognition site,
located in the variable region of the immunoglobulin molecule. As used herein, the term
encompasses not only intact antibodies, but also fragments thereof, mutants thereof, fusion proteins,
humanized antibodies, and any other modified configuration of the immunoglobulin molecule that
comprises an antigen recognition site of the required specificity.

The term "antigen" refers to the target molecule that is specifically bound by an antibody
through its antigen recognition site. The antigen may, but need not, be chemically related to the
immunogen that stimulated production of the antibody. The antigen may be polyvalent, or it may be a
monovalent hapten. Examples of kinds of antigens that can be recognized by antibodies include
polypeptides, polynucleotides, other antibody molecules, oligosaccharides, complex lipids, drugs, and
chemicals. An "immunogen" is an antigen capable of stimulating production of an antibody when
injected into a suitable host, usually a mammal. Compounds may be rendered immunogenic by many
techniques known in the art, including crosslinking or conjugating with a carrier to increase valency,
mixing with a mitogen to increase the immune response, and combining with an adjuvant to enhance
presentation.

An "active vaccine" is a pharmaceutical preparation for human or animal use, which is used
with the intention of eliciting a specific immune response. The immune response may be either
humoral or cellular, systemic or secretory. The immune response may be desired for experimental
purposes, for the treatment of a particular condition, for the elimination of a particular substance, or for
prophylaxis against a particular condition or substance.

An "isolated" polynucleotide, polypeptide, protein, antibody, or other substance refers to a
preparation of the substance devoid of at least some of the other components that may also be
present where the substance or a similar substance naturally occurs or from which it is initially obtained.
Thus, for example, an isolated substance may be prepared by using a purification technique to enrich
it from a source mixture. Enrichment can be measured on an absolute basis, such as weight per
volume of solution, or it can be measured in relation to a second, potentially interfering substance
present in the source mixture. Increasing enrichments of the embodiments of this invention are
increasingly more preferred. Thus, for example, a 2-fold enrichment is preferred, 10-fold enrichment
is more preferred, 100-fold enrichment is more preferred, 1000-fold enrichment is even more
preferred. A substance can also be provided in an isolated state by a process of artificial assembly,
such as by chemical synthesis or recombinant expression.
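
The two ways of measuring enrichment mentioned above, on an absolute basis or in relation to a second substance, can be written out as simple ratios. The function names and the example figures are hypothetical and serve only to make the arithmetic concrete.

```python
def fold_enrichment_absolute(concentration_after, concentration_before):
    """Enrichment on an absolute basis, e.g. weight per volume after vs. before purification."""
    if concentration_before <= 0:
        raise ValueError("starting concentration must be positive")
    return concentration_after / concentration_before


def fold_enrichment_relative(substance_after, interferent_after,
                             substance_before, interferent_before):
    """Enrichment relative to a second, potentially interfering substance."""
    if min(interferent_after, substance_before, interferent_before) <= 0:
        raise ValueError("reference amounts must be positive")
    ratio_after = substance_after / interferent_after
    ratio_before = substance_before / interferent_before
    return ratio_after / ratio_before


if __name__ == "__main__":
    # Hypothetical purification: 0.5 mg/ml in the source mixture, 5.0 mg/ml afterwards.
    print(fold_enrichment_absolute(5.0, 0.5))
    # Substance:interferent ratio rises from 1:4 to 8:2.
    print(fold_enrichment_relative(8.0, 2.0, 1.0, 4.0))
```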

A polynucleotide used in a reaction, such as a probe used in a hybridization reaction, a primer
used in a PCR, or a polynucleotide present in a pharmaceutical preparation, is referred to as "specific"
or "selective" if it hybridizes or reacts with the intended target more frequently, more rapidly, or with
greater duration than it does with alternate substances. Similarly, an antibody is referred to as
"specific" or "selective" if it binds via at least one antigen recognition site to the intended target more
frequently, more rapidly, or with greater duration than it does to alternate substances. A
polynucleotide or antibody is said to "selectively inhibit" or "selectively interfere with" a reaction if it
inhibits or interferes with the reaction between particular substrates to a greater degree or for a
greater duration than it does with the reaction between alternative substrates. An antibody is capable
of "specifically delivering" a substance if it conveys or retains that substance near a particular cell type
more frequently or for a greater duration compared with other cell types.

The "effector component" of a pharmaceutical preparation is a component that modifies
target cells by altering their function in a desirable way when administered to a subject bearing the
cells. Some advanced pharmaceutical preparations also have a "targeting component", such as an
antibody, which helps deliver the effector component more efficaciously to the target site. Depending
on the desired action, the effector component may have any one of a number of modes of action. For
example, it may restore or enhance a normal function of a cell, it may eliminate or suppress an
abnormal function of a cell, or it may alter a cell's phenotype. Alternatively, it may kill or render
dormant a cell with pathological features, such as a cancer cell. Examples of effector components are
provided in a later section.

A "pharmaceutical candidate" or "drug candidate" is a compound believed to have therapeutic
potential, that is to be tested for efficacy. The "screening" of a pharmaceutical candidate refers to
conducting an assay that is capable of evaluating the efficacy and/or specificity of the candidate. In
this context, "efficacy" refers to the ability of the candidate to affect the cell or organism it is
administered to in a beneficial way; for example, the limitation of the pathology of cancerous cells.

A "cell line" or "cell culture" denotes higher eukaryotic cells grown or maintained in vitro. It is
understood that the descendants of a cell may not be completely identical (either morphologically,
genotypically, or phenotypically) to the parent cell. Cells described as "uncultured" are obtained
directly from a living organism, and have been maintained for a limited amount of time away from the
organism: not long enough or under conditions for the cells to undergo substantial replication.

"Genetic alteration" refers to a process wherein a genetic element is introduced into a cell 
other than by mitosis or meiosis. The element may be heterologous to the cell, or it may be an 
additional copy or improved version of an element already present in the cell. Genetic alteration
may be effected, for example, by transfecting a cell with a recombinant plasmid or other
polynucleotide through any process known in the art, such as electroporation, calcium phosphate
precipitation, or contacting with a polynucleotide-liposome complex, or by transduction or infection
with a DNA or RNA virus or viral vector. The alteration is preferably but not necessarily inheritable
by progeny of the altered cell.

A "host cell" is a cell which has been genetically altered, or is capable of being genetically
altered, by administration of an exogenous polynucleotide.

The terms "cancerous cell" or "cancer cell", used either in the singular or plural form, refer to
cells that have undergone a malignant transformation that makes them pathological to the host
organism. Malignant transformation is a single- or multi-step process, which involves in part an
alteration in the genetic makeup of the cell and/or the expression profile. Malignant transformation
may occur either spontaneously, or via an event or combination of events such as drug or chemical
treatment, radiation, fusion with other cells, viral infection, or activation or inactivation of particular
genes. Malignant transformation may occur in vivo or in vitro, and can if necessary be experimentally
induced.

A frequent feature of cancer cells is the tendency to grow in a manner that is uncontrollable 
by the host, but the pathology associated with a particular cancer cell may take another form, as 
outlined infra. Primary cancer cells (that is, cells obtained from near the site of malignant 
transformation) can be readily distinguished from non-cancerous cells by well-established techniques, 
particularly histological examination. The definition of a cancer cell, as used herein, includes not only 
a primary cancer cell, but any cell derived from a cancer cell ancestor. This includes metastasized 
cancer cells, and in vitro cultures and cell lines derived from cancer cells. 

The "pathology" caused by a cancer cell within a host is anything that compromises the 
well-being or normal physiology of the host. This may involve (but is not limited to) abnormal or
uncontrollable growth of the cell, metastasis, release of cytokines or other secretory products at an
inappropriate level, manifestation of a function inappropriate for its physiological milieu, interference
with the normal function of neighboring cells, aggravation or suppression of an inflammatory or
immunological response, or the harboring of undesirable chemical agents or invasive organisms.

"Treatment" of an individual or a cell is any type of intervention in an attempt to alter the
natural course of the individual or cell. For example, treatment of an individual may be undertaken to
decrease or limit the pathology caused by a cancer cell harbored in the individual. Treatment includes
(but is not limited to) administration of a composition, such as a pharmaceutical composition, and may
be performed either prophylactically, or subsequent to the initiation of a pathologic event or contact 
with an etiologic agent. Effective amounts used in treatment are those which are sufficient to 
produce the desired effect, and may be given in single or divided doses. 

A "control cell" is an alternative source of cells or an alternative cell line used in an experiment
for comparison purposes. Where the purpose of the experiment is to establish a base line for gene
copy number or expression level, it is generally preferable to use a control cell that is not a cancer
cell.

The term "cancer gene" as used herein refers to any gene which is yielding transcription or
translation products at a substantially altered level or in a substantially altered form in cancerous cells
compared with non-cancerous cells, and which may play a role in supporting the malignancy of the
cell. It may be a normally quiescent gene that becomes activated (such as a dominant
proto-oncogene), it may be a gene that becomes expressed at an abnormally high level (such as a
growth factor receptor), it may be a gene that becomes mutated to produce a variant phenotype, or it
may be a gene that becomes expressed at an abnormally low level (such as a tumor suppressor
gene). The present invention is directed towards the discovery of genes in all these categories.

It is understood that a "clinical sample" encompasses a variety of sample types obtained from
a subject and useful in an in vitro procedure, such as a diagnostic test. The definition encompasses
solid tissue samples obtained as a surgical removal, a pathology specimen, or a biopsy specimen,
tissue cultures or cells derived therefrom and the progeny thereof, and sections or smears prepared
from any of these sources. Non-limiting examples are samples obtained from breast tissue, lymph
nodes, and tumors. The definition also encompasses blood, spinal fluid, and other liquid samples of
biologic origin, and may refer to either the cells or cell fragments suspended therein, or to the liquid
medium and its solutes.

The term "relative amount" is used where a comparison is made between a test
measurement and a control measurement. Thus, the relative amount of a reagent forming a complex
in a reaction is the amount reacting with a test specimen, compared with the amount reacting with a
control specimen. The control specimen may be run separately in the same assay, or it may be part
of the same sample (for example, normal tissue surrounding a malignant area in a tissue section).

A "differential" result is generally obtained from an assay in which a comparison is made 
between the findings of two different assay samples, such as a cancerous cell line and a control cell 
line. Thus, for example, "differential expression" is observed when the level of expression of a 
particular gene is higher in one cell than another. "Differential display" refers to a display of a 
component, particularly RNA, from different cells to determine if there is a difference in the level of the
component amongst different cells. Differential display of RNA is conducted, for example, by selective 
production and display of cDNA corresponding thereto. A method for performing differential display is 
provided in a later section. 

A polynucleotide derived from or corresponding to CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, 
or CH14-2a16-1 is any of the following: the respective cDNA fragments, the corresponding
messenger RNA, including splice variants and fragments thereof, both strands of the corresponding
full-length cDNA and fragments thereof, and the corresponding gene. Isolated allelic variants of any
of these forms are included. This invention embodies any polynucleotide corresponding to CH1-9a11-2,
CH8-2a13-1, CH13-2a12-1, or CH14-2a16-1 in an isolated form. It also embodies any such
polynucleotide that has been cloned or transfected into a cell line. 

When used in referring to the gene screening methods of this invention (such as those 
outlined in the last paragraph), "displaying cDNA" is any technique in which DNA copies of RNA
(not restricted to mRNA) are rendered detectable in a quantitative or relatively quantitative fashion, in
that DNA copies present in a relatively greater amount in a first sample compared with a second
sample generate a relatively stronger or weaker signal compared with that of the second sample
due to the difference in copy number. Separate display of different cDNA in a preparation
(particularly but not limited to cDNA of different size) allows comparison of levels of a particular
cDNA between different samples. A preferred method of display is the differential display
technique, and enhancements thereupon described in this disclosure and elsewhere.

The term "digested" DNA encompasses DNA (particularly chromosomal DNA) that has
been fragmented by any suitable chemical or enzymatic means into fragments conveniently
separable by standard techniques, particularly gel electrophoresis. Digestion with a restriction
endonuclease specific for a particular nucleotide sequence is preferred.

"Hybridizing" in this context refers to contacting a first polynucleotide with a second
polynucleotide under conditions that permit the formation of a multi-stranded polynucleotide duplex
whenever one strand of the first polynucleotide has a sequence of sufficient complementarity to a
sequence on the second polynucleotide. The duplex may be a long-lived one, such as when one
DNA molecule is used as a labeled probe to detect another DNA molecule, that may optionally be
bound to a nitrocellulose filter or present in a separating gel. The duplex may also be a shorter-lived
one, such as when one DNA molecule is used to prime an amplification reaction of the other
DNA molecule, and the amplified product is subsequently detected. The practitioner may alter the
conditions of the reaction to alter the degree of complementarity required, as long as sequence
specificity remains a determining factor in the reaction.

Unless explicitly indicated or otherwise required by the techniques used, the steps of a
method of this invention may be performed in any order, or combined where desired and
appropriate. In one example, in the method comprising steps a) through h) that is described
above, it is entirely appropriate to conduct steps a) to c) of the method either before or after steps
e) to g) of the method, as long as the cDNA ultimately selected fulfills the criteria of both step d)
and step h). In another example, screening against different digested DNA preparations, even if
outlined separately, may optionally be done at the same time. All permutations of this kind are
within the scope of the invention.

General methods 

The practice of the present invention will employ, unless otherwise indicated, conventional
techniques of molecular biology, microbiology, recombinant DNA, and immunology, which are within
the skill of the art. Such techniques are explained fully in the literature. See, for example, "Molecular
Cloning: A Laboratory Manual", Second Edition (Sambrook, Fritsch & Maniatis, 1989);
"Oligonucleotide Synthesis" (M.J. Gait, ed., 1984); "Animal Cell Culture" (R.I. Freshney, ed., 1987);
the series "Methods in Enzymology" (Academic Press, Inc.); "Handbook of Experimental Immunology"
(D.M. Weir & C.C. Blackwell, eds.); "Gene Transfer Vectors for Mammalian Cells" (J.M. Miller & M.P.
Calos, eds., 1987); "Current Protocols in Molecular Biology" (F.M. Ausubel et al., eds., 1987); and
"Current Protocols in Immunology" (J.E. Coligan et al., eds., 1991). All patents, patent applications,
articles and publications mentioned herein, both supra and infra, are hereby incorporated herein by
reference.

Features of the cancer gene screening method 

The cancer gene screening methods of this invention may be brought to bear to discover 
novel genes associated with cancer. Exemplars of cancer-associated genes identified by this 
method are described below. The exemplars were identified using breast cancer cell lines and 
tissue, but the strategy can be applied to any cancer type of interest. 

A central feature of the cancer gene screening method of this invention is to look for both
DNA duplication and RNA overabundance relating to the same gene. This feature is particularly
powerful in the discovery of new and potentially important cancer genes. While amplicons occur
frequently in cancer, the presently available techniques indicate only the broad chromosomal
region involved in the duplication event, not the specific genes involved. The present invention
provides a way of detecting genes that may be present in an amplicon from a functional basis.
Because an early part of the method involves detecting RNA, the method avoids genes that may
be duplicated in an amplicon but are quiescent (and therefore irrelevant) in the cancer cells.
Furthermore, it recruits active genes from a duplicated region of the chromosome too small to be
detectable by the techniques used to describe amplicons.

Near the heart of this approach are several concepts. One is that genes encoding
products implicated positively in the malignant process achieve elevated gene expression as a part
of malignant transformation. In this context, "gene expression" refers to expression at the RNA
transcription level. Most typically, the RNA is in turn translated into a protein with a particular
enzymatic, binding, or regulatory activity which increases after malignant transformation. In a less
common example, the RNA may encode or participate as a ribozyme, antisense polynucleotide, or
other functional nucleic acid molecule during malignancy. In a third example, RNA expression may
be incidental but symptomatic of an important event in transformation.

Another concept is that overexpression, if central to malignant transformation, may be
achieved in different tumors by different mechanisms, and that at least one such possible
mechanism is gene duplication. Accordingly, a substantial proportion of transformed cells will have
an amplicon, or duplicated region of a chromosome, that includes within its compass the
overexpressed gene. Other transformed cells may achieve RNA overabundance without gene
duplication, such as by increasing the rate of transcription of the gene (e.g., by upregulation of the
promoter region), by enhancing transcript promotion or transport, or by increasing mRNA survival.

Thus, the method entails screening, at the RNA level, several cancer cell lines or tumors
and several normal cell lines or tissue samples at the same time. RNAs are selected that show a
consistent elevation amongst the cancer cells as compared with normal cells. Additional strategies
may be employed in combination with the RNA screening to improve the success rate of the
method. One such strategy is to use several cancer cell lines that are all known to have duplicated
genes in the same region of a particular chromosome. Thus, the RNAs that emerge from the screen
are more likely to represent a deliberate overexpression event, and the overexpressed gene is
likely to be within the duplicated region. A supplemental strategy is to use freshly prepared tissue
samples rather than cell lines as controls for base-line expression. This avoids selection of genes
that may alter their expression level just as a result of tissue culturing. Another supplemental
strategy is to conduct an additional level of screening, following identification of shared,
overexpressed RNA. The selected RNAs are used to screen DNA from suitable cancer cells and
normal cells, to ensure that at least a proportion of the cells achieved the overexpression by way of
gene duplication.

The strategy for detecting such genes comprises a number of innovations over those that
have been used in previous work.

The first part of the method is based on a search for particular RNAs that are overabundant 
in cancer cells. A first innovation of the method is to compare RNA abundance between control 
cells and several different cancer cells or cancer cell lines of the desired type. The cDNA
fragments that emerge in a greater amount in several different cancer lines, but not in control cells,
are more likely to reflect genes that are important in disease progression, rather than those that
have undergone secondary or coincidental activation. It is particularly preferred to use cancer cells
that are known to share a common duplicated chromosomal region.

A second innovation of this method is to supply as control, not RNA from a cell line or
culture, but from fresh tissue samples of non-malignant origin. There are two reasons for this.
First, the tissue will provide the spectrum of expression that is typical to the normal cell phenotype,
rather than individual differences that may become more prominent in culture. This establishes a
more reliable baseline for normal expression levels. More importantly, the tissue will be devoid of
the effects that in vitro culturing may have in altering or selecting particular phenotypes. For
example, proto-oncogenes or growth factors may become up-regulated in culture. When cultured
cells are used as the control for differential display, these up-regulated genes would be missed.

A third innovation of this method is to undertake a subselection for cDNA corresponding to
genes that achieve their RNA overabundance in a substantial proportion of cancer cells by gene
duplication. To accomplish this, appropriate cDNA corresponding to overabundant RNA identified
in the foregoing steps are used to probe digests of cellular DNA from a panel of different cancer
cells, and from normal genomic DNA. cDNA that shows evidence of higher copy numbers in a
proportion of the panel are selected for further characterization. An additional advantage of this
step is that cDNA corresponding to mitochondrial genes can rapidly be screened away by including
a mitochondrial DNA digest as an additional sample for testing the probe. This eliminates most of
the false-positive cDNA, which otherwise make up a majority of the cDNA identified.
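
The subselection described in this paragraph can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the probe and sample names, the copy-number ratios, and the thresholds `min_ratio` and `min_fraction` are all hypothetical, and the disclosure does not prescribe particular cutoff values.

```python
def select_duplicated_probes(copy_ratios, mitochondrial_hits,
                             min_ratio=2.0, min_fraction=0.25):
    """Keep cDNA probes whose gene appears duplicated in part of a cancer panel.

    copy_ratios: {probe: {cancer_sample: signal relative to normal genomic DNA}}
    mitochondrial_hits: probes that hybridized to the mitochondrial DNA digest
    min_ratio, min_fraction: hypothetical cutoffs, not taken from the disclosure
    """
    selected = []
    for probe, ratios in copy_ratios.items():
        if probe in mitochondrial_hits:
            continue  # screen away cDNA corresponding to mitochondrial genes
        amplified = sum(1 for r in ratios.values() if r >= min_ratio)
        if ratios and amplified / len(ratios) >= min_fraction:
            selected.append(probe)
    return selected


# Hypothetical panel of four cancer samples probed with three cDNAs.
panel = {
    "probe_A": {"s1": 4.1, "s2": 1.0, "s3": 3.2, "s4": 0.9},
    "probe_B": {"s1": 1.1, "s2": 0.9, "s3": 1.0, "s4": 1.2},
    "probe_M": {"s1": 6.0, "s2": 5.5, "s3": 7.1, "s4": 6.3},
}
print(select_duplicated_probes(panel, mitochondrial_hits={"probe_M"}))
```

In this hypothetical panel only probe_A is retained: probe_B shows no elevated copy number in any sample, and probe_M is discarded because it hybridizes to the mitochondrial digest.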

Thus, the identification of genes yielding products that are present at abnormal levels is 
accomplished by a method comprised of the following steps. 

To identify particular RNA that is overabundant in cancer cells, RNA is prepared from both
cancerous and control cells by standard techniques. Cancer-associated genes may affect cellular
metabolism by any one of a number of mechanisms. For example, they may encode ribozymes,
anti-sense polynucleotides, DNA-binding polynucleotides, altered ribosomal RNA, and the like.
The gene screening methods of this invention may employ a comparison of RNA abundance levels
at the total RNA level, not strictly limited to mRNA. However, the vast majority of cancer-associated
genes are predicted to encode a protein whose up-regulation is closely linked to
the metabolic process. For example, the four exemplary breast cancer genes described elsewhere
in this application all comprise an open reading frame. Accordingly, a focus on mRNA enriches the
selectable pool for candidate cancer-associated genes. Focus towards mRNA can be conducted
at any step in the method. It is particularly convenient to use a display method that displays cDNA
copied only from mRNA. In this case, whole RNA may be prepared and analyzed from cancer and
control cell populations without separating out mRNA.

In terms of the cancer cells used as an RNA source, it is particularly advantageous to use 
a plurality of cancer cells known to contain a duplicated gene or chromosomal segment in the same 
region of the chromosome. The duplicated segment need not be the same size in all the cells, nor
is it necessary that the number of duplications be the same, so long as there is at least some part
of the duplicated segment that is shared amongst all the cancer cells used in the screen. Thus, a
minimum of two, and preferably at least three, cancer cells are used that are sufficiently
characterized to identify a shared duplicated region, and can be used as a source of RNA for the
screening test. In contrast, the control cell population will not comprise chromosomal duplications. 
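
The requirement that some part of the duplicated segment be shared amongst all the cancer cells amounts to an interval intersection. The following Python sketch illustrates this under the assumption that each duplicated segment has been mapped to a start and end position on the same chromosome; the coordinates shown are hypothetical and in arbitrary map units.

```python
def shared_duplicated_region(segments):
    """Return the (start, end) interval common to all duplicated segments.

    segments: list of (start, end) positions, one per cancer cell, all on the
    same chromosome. Returns None if the segments have no common part.
    """
    if len(segments) < 2:
        raise ValueError("at least two cancer cells are needed for the screen")
    start = max(s for s, _ in segments)
    end = min(e for _, e in segments)
    return (start, end) if start < end else None


# Hypothetical duplicated segments from three cancer cells (arbitrary map units).
print(shared_duplicated_region([(10, 50), (30, 80), (20, 45)]))
```

The segments need not be the same size; a panel is usable for the screen whenever the function returns an interval rather than None.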

Assuming the duplication to be related to the malignancy of the cancer cells, RNA
transcribed from the duplicated region is expected to be overabundant compared with that of the
control cell. Accordingly, a highly effective strategy is to identify overabundant RNA that is present
in all (or at least several) of the cancer cell preparations, but none of the control preparations. By
using cancer cells that share a duplicated chromosomal region, the RNA comparison will be
strongly biased in favor of RNA overabundance transcribed from the shared duplicated region.
Since the shared region is optimally only a small segment of a single chromosome, expression
differences arising from elsewhere in the genome in one cancer cell or another will not be selected.
We have found that this is highly effective in eliminating: a) RNA abundance differences resulting
from normal metabolic variations between cells; and/or b) RNA abundance differences related to
cancer cell malignancy, but occurring secondarily to malignant transformation. This is important
because it considerably minimizes the chief deficiency in the use of RNA comparison methods,
particularly differential display, for the screening of potential cancer genes: namely, the onerous
number of false-positives that such techniques generate.
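
The selection rule described above (overabundant in all or several cancer preparations, but in none of the controls) can be sketched as follows. This Python example is illustrative only: band identifiers, intensities, and the `fold` and `min_cancer_lines` parameters are hypothetical assumptions, not values given in this disclosure.

```python
def select_overabundant_bands(cancer, control, fold=2.0, min_cancer_lines=2):
    """Pick bands elevated in several cancer preparations relative to every control.

    cancer, control: {sample_name: {band_id: intensity}}
    fold, min_cancer_lines: hypothetical cutoffs chosen for illustration
    """
    bands = set().union(*(lane.keys() for lane in cancer.values()))
    selected = []
    for band in sorted(bands):
        # Highest level seen in any control preparation sets the baseline.
        baseline = max((lane.get(band, 0.0) for lane in control.values()),
                       default=0.0)
        hits = sum(
            1 for lane in cancer.values()
            if lane.get(band, 0.0) > 0 and lane.get(band, 0.0) > fold * baseline
        )
        if hits >= min_cancer_lines:
            selected.append(band)
    return selected


# Hypothetical band intensities from three cancer and two control preparations.
cancer_lanes = {
    "c1": {"b1": 9.0, "b2": 1.0, "b3": 5.0},
    "c2": {"b1": 8.0, "b2": 1.2},
    "c3": {"b1": 7.5, "b2": 0.9, "b3": 0.5},
}
control_lanes = {
    "n1": {"b1": 1.0, "b2": 1.1, "b3": 0.6},
    "n2": {"b1": 1.5, "b2": 1.0, "b3": 0.4},
}
print(select_overabundant_bands(cancer_lanes, control_lanes))
```

Requiring agreement across several cancer preparations is what removes band b3 here: it is elevated in only one cancer preparation, which is the pattern expected of a coincidental or secondary difference.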

Shared duplicated regions in cancer cells may be identified by a relevant analytical
technique, or by reference to such analysis already conducted and published. One approach that
has been highly effective in mapping approximate subchromosomal locations of duplicated
segments is comparative genomic hybridization (CGH). This technique involves extracting,
amplifying and labeling DNA from the subject cell; hybridizing to reference metaphase
chromosomes treated to remove repetitive sequences; and observing the position of the hybridized
DNA on the chromosomes (WO 93/18186; Gray et al.). The greater the signal intensity at a given
position, the greater the copy number of the sequences in the subject cell. Thus, regions showing
elevated staining correspond to genes duplicated in the cancer cells, while regions showing
diminished staining correspond to genes deleted in the cancer cells. Related techniques of which a
practitioner in the art will be well aware are methods for preparing and using repeat sequence
chromosome-specific nucleic acid probes (US 5,427,932; Weier et al.), methods for staining target
chromosomal DNA using labeled nucleic acid fragments in conjunction with blocking fragments
complementary to repetitive DNA segments (US 5,447,841; Gray et al.), and methods for detecting
amplified or deleted chromosomal regions using a mapped library of labeled polynucleotide probes
(US 5,472,842; Stokke et al.). If desired, multiple fluorochromes can be used as labeling agents
with CGH and related techniques, to provide a three-color visualization of deleted, normal, and
duplicated chromosome abnormalities (Lucas et al.).
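
The interpretation of staining intensity described above can be expressed as a simple classification. The Python sketch below assumes that the signal from the subject cell at a given chromosomal position is compared against a signal from normal reference DNA at the same position; that comparison and the `gain` and `loss` thresholds are assumptions made for illustration and are not specified in this disclosure.

```python
def classify_cgh_ratio(test_signal, reference_signal, gain=1.25, loss=0.75):
    """Classify a chromosomal position from its test-to-reference signal ratio.

    gain and loss are hypothetical thresholds chosen for illustration.
    """
    if reference_signal <= 0:
        raise ValueError("reference signal must be positive")
    ratio = test_signal / reference_signal
    if ratio >= gain:
        return "duplicated"   # elevated staining
    if ratio <= loss:
        return "deleted"      # diminished staining
    return "normal"


# Hypothetical signal pairs (test, reference) at three positions.
for pair in [(2.0, 1.0), (0.5, 1.0), (1.0, 1.0)]:
    print(pair, classify_cgh_ratio(*pair))
```

The three outcomes correspond to the deleted, normal, and duplicated categories that a three-color visualization would distinguish.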

The choice of a particular chromosomal mapping approach is irrelevant, especially once
the location of the duplicated region is known. If the location of the chromosome duplication is
already established for a cell line to be used in RNA comparison during the course of the present
invention, then it is unnecessary to conduct a mapping technique de novo. For example,
established cancer cell lines exist for which mapping data is already available in the public domain.
Provided in the reference section of this application is a list of over 40 articles in which the
locations of duplicated regions in particular cancer cells are described. In the context of the
present invention, a plurality of cancer cells is chosen for the screening panel based on such data
so that they share a duplicated chromosomal region. The chromosomal location of a suspected
duplication may be confirmed by hybridization analysis, if desired, using a probe specific for the
location.

The cancer cells used for RNA comparison are also generally (but not necessarily) derived
from the same type of cancer or the same tissue. Using cells derived from the same type of cancer
increases the probability that the gene ultimately identified will be common in that type of cancer
and suitable as a type-specific diagnostic marker. Using cells derived from different types of
cancer is in effect a search for cancer-related genes that are less tissue specific and more related
to the malignant process in general. Both types of genes are of interest for both diagnostic and
therapeutic purposes. In one illustration highlighted in Example 1, RNA was screened from the
three breast cancer cell lines BT474, SKBR3, and MCF7, which have been determined by CGH or
Southern analysis to share duplicated genetic regions in chromosomes 1, 8, 14, 17 and 20.
When the RNA from these cells was displayed, a number of RNA were found to be overabundant
in the cancer cells, but not controls (Figure 1). Three RNA overabundant in all three cancer cell
lines corresponded to cancer-associated genes located on chromosomes 1, 8, and 14 that are
listed in Table 1. The chromosome 13 gene (CH13-2a12-1) was overexpressed in 2 of the 3 cell
lines; namely BT474 and SKBR3. Southern analysis subsequently established that the
chromosome 13 gene was duplicated in the same two cell lines (Example 6, Table 5).

Selection of the source or sources of control cell RNA is also a matter of some refinement.
The control RNA can be derived from in vitro cultures of non-malignant cells, or established cell
lines derived from a non-malignant source. However, it is preferable for the control RNA to be
obtained directly from normal human tissue of the same type as the cancer cells. This is because
most normal cells do not proliferate indefinitely; hence adaptation of a cell into a cell line involves a
degree of transformation. The transforming event may, in turn, be shared with that of certain
cancer cells, at least at the level of RNA abundance. Hence, comparison of the RNA levels in
cancer cells with so-called control cell lines may lead the practitioner to miss genes that are related
to malignancy. For convenience, control cells may be maintained in culture for a brief period
before the experiment, and even stimulated; however, multiple rounds of cell division are to be
avoided if possible. Use of both stimulated and unstimulated cells as controls may help provide
RNA patterns corresponding to the normal range of abundance within various metabolic events of
the cell cycle. In one illustration highlighted in Example 1, RNA was screened using both
proliferating and non-proliferating cells. As stated, the screening of breast cancer RNA is
preferably conducted using uncultured normal mammary epithelial cells (termed "organoids") as
sources of control RNA. These cells may be obtained from surgical samples resected from healthy
breast tissue.

The RNA is preserved until use in the comparison experiment in such a way as to minimize
fragmentation. To facilitate confirmation experiments, it is useful to use RNA of a reproducible
character. For this reason, it is convenient to use RNA that has been obtained from stable
cancerous cell lines and/or ready tissue sources, although reproducibility can also be provided by
preparing enough RNA so that it can be preserved in aliquots.

For displaying relative overabundance of RNA in the cancer cells, compared with the
control cells, many standard techniques are suitable. These would include any form of subtractive
hybridization or comparative analysis. Preferred are techniques in which more than two RNA
sources are compared at the same time, such as various types of arbitrarily primed PCR
fingerprinting techniques (Welsh et al., Yoshikawa et al.). Particularly preferred are differential
mRNA display methods and variations thereof, in which the samples are run in neighboring lanes in
a separating gel. These techniques are focused towards mRNA by using primers that are specific
for the poly-A tail characteristic of mRNA (Liang et al., 1992a; U.S. Patent 5,262,311).

Because many thousands of genes are expressed in the cells of higher organisms at any 
one time, it is preferable to improve the legibility of the display by surveying only a subset of the 
RNA at a time. Methods for accomplishing this are known in the art. A preferred method is by 
using selective primers that initiate PCR replication for a subset of the RNA. Thus, the RNA is first 
reverse transcribed by standard techniques. Short primers are used for the selection, preferably 
chosen such that alternative primers used in a series of like assays can complete a comprehensive 
survey of the mRNA. 

In a preferred example, primers can be used for the 3' region of the mRNAs which have an
oligo-dT sequence, followed by two other nucleotides (TiNM, where i ≈ 11, N ∈ {A,C,G}, and M ∈
{A,C,G,T}). Thus, 12 possible primers are required to complete the survey. A random or arbitrary
primer of minimal length can then be used for replication towards what corresponds in the
sequence to the 5' region of the mRNA. The optimal length for the random primer is about 10
nucleotides. The product of the PCR reaction is labeled with a radioisotope, such as ³⁵S. The
labeled cDNA is then separated by molecular weight, such as on a polyacrylamide sequencing gel.
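
The count of 12 anchored primers follows directly from the three choices for N and the four choices for M. The following Python sketch enumerates them, assuming for illustration an oligo-dT run of exactly 11 residues; the function name is hypothetical.

```python
from itertools import product


def anchored_primers(t_length=11):
    """Enumerate oligo-dT primers of the form T(i)NM, with N in {A,C,G} and M in {A,C,G,T}."""
    return ["T" * t_length + n + m for n, m in product("ACG", "ACGT")]


primers = anchored_primers()
print(len(primers))
for primer in primers:
    print(primer)
```

Excluding T at position N is what anchors each primer at the junction between the poly-A tail and the rest of the transcript, so that each primer selects a distinct subset of the mRNA.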

If desired, variations on the differential display technique may be employed. For example,
one-base oligo-dT primers may be used (Liang et al., 1993 & 1994), although this is generally less
preferred because the display pattern is correspondingly more complex. Selection of primers may
be optimized mathematically depending on the number of RNA species in a tissue of interest
(Bauer et al.). The method may be adapted for non-denaturing gels, and for use with automatic
DNA sequencers (Bauer et al.). Alternative radioisotopes (Trentmann et al.) or fluorochromes (Sun
et al.) may be used for labeling the differential display. Differential display may optionally be
combined with a ribonuclease protection assay (Yeatman et al.). PCR primers may optionally
incorporate a restriction site to facilitate cloning (Linskens et al., Ayala et al.). Using Taq
polymerase from multiple manufacturers can increase the amount of variation under otherwise
identical conditions (Haag et al.). Nested PCR primers may be used in differential display to
decrease background created by oligo-dT primers (WO 95/33760). Other variants of the
differential display technique are known in the art and described inter alia in the references cited in
this disclosure. The use of such modifications is within the scope of the present invention, but is
not required, as evidenced by the examples described below.

Based on the comparison of relative abundance of RNA, particular RNAs are chosen which 
are present as a higher proportion of the RNA in cancerous cells, compared with control cells. 
When using the differential display method, the cDNA corresponding to overabundant RNA will 
produce a band with greater proportional intensity amongst neighboring cDNA bands, compared 
with the proportional intensity in the control lanes. Desired cDNAs can be recovered most directly 
by cutting the spot in the gel corresponding to the band, and recovering the DNAs therefrom. 
Recovered cDNA can be replicated again for further use by any technique or combination of 
techniques known in the art, including PCR and cloning into a suitable carrier.
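
The band selection criterion can be illustrated numerically. The sketch below assumes that band intensities have already been measured by densitometry in arbitrary units; the band labels, the two-fold ratio and the two-lane minimum are hypothetical choices made only for illustration.

```python
def proportional_intensity(lane, band):
    """Fraction of a lane's summed band intensity contributed by one band."""
    total = sum(lane.values())
    return lane.get(band, 0.0) / total if total else 0.0

def overabundant_bands(cancer_lanes, control_lanes, min_ratio=2.0, min_lanes=2):
    """Return bands whose proportional intensity is at least `min_ratio`
    times the mean control proportion in at least `min_lanes` cancer lanes.
    Both thresholds are illustrative assumptions."""
    selected = []
    for band in sorted(control_lanes[0]):
        control = sum(proportional_intensity(lane, band)
                      for lane in control_lanes) / len(control_lanes)
        if control <= 0:
            continue  # band absent from controls; needs separate handling
        hits = sum(1 for lane in cancer_lanes
                   if proportional_intensity(lane, band) / control >= min_ratio)
        if hits >= min_lanes:
            selected.append(band)
    return selected

control = [{"b1": 10, "b2": 10, "b3": 10}]
cancer = [
    {"b1": 10, "b2": 50, "b3": 10},
    {"b1": 10, "b2": 60, "b3": 10},
    {"b1": 10, "b2": 10, "b3": 10},
]
chosen = overabundant_bands(cancer, control)
```

Using the proportion within each lane, and not the raw intensity, keeps the comparison from being dominated by differences in how much cDNA was loaded.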

An optional but highly beneficial additional screening step, typically performed 
subsequently to an RNA comparison as described above, is aimed at identifying genes that are 
duplicated in a substantial proportion of cancers. This is conducted by using cDNA such as 
selected from differential display to probe digests of chromosomal DNA obtained from two or more 
cancerous cells, such as cancer cell lines. Chromosomal DNA from non-cancerous cells that 
essentially reflects the germ line in terms of gene copy number is used for the control. A preferred 
source of control DNA in experiments for human cancer genes is placental DNA, which is readily 
obtainable. The DNA samples are cleaved at sequence-specific sites along the chromosome, most 
usually with a suitable restriction enzyme into fragments of appropriate size. The DNA can be 
blotted directly onto a suitable medium, or separated on an agarose gel before blotting. The latter 
method is preferred, because it enables a comparison of the hybridizing chromosomal restriction 
fragment to define whether the probe is binding to the same fragment, and in what amount. The 
amount of probe binding to DNA digests from each of the cancer cells is compared with the amount 
binding to control DNA.

Because the comparison is quantitative, it is preferable to standardize the measurement. 
One method is to administer a second probe to the same blot, for a second DNA sequence 
unlikely to be duplicated in the cancer cells. This method is preferred, because it 
standardizes not only for differences in the amount of DNA provided, but also for differences in 
the amount transferred during blotting. This can be accomplished by using alternative labels for 
the two probes, or by stripping the first probe with a suitable eluant before administering the 
second.
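
The standardization with a second probe amounts to a ratio of ratios. The following sketch shows the arithmetic under the assumption that hybridization signals are available as positive numbers in arbitrary units; the function and argument names are hypothetical.

```python
def relative_copy_number(test_signal, ref_signal, control_test, control_ref):
    """Gene copy number in a cancer DNA digest relative to control DNA.

    Each test-probe signal is divided by the signal of a second
    (reference) probe from the same blot lane, which corrects for
    unequal DNA loading and transfer. The normalized cancer value is
    then divided by the normalized control value.
    """
    if min(ref_signal, control_test, control_ref) <= 0:
        raise ValueError("reference and control signals must be positive")
    return (test_signal / ref_signal) / (control_test / control_ref)

# A lane with four times the test signal at equal loading
duplicated = relative_copy_number(400, 100, 100, 100)
# A lane loaded twice as heavily, but with no change in copy number
overloaded = relative_copy_number(200, 200, 100, 100)
```

The second case shows why the reference probe matters: without it, a heavily loaded lane would be mistaken for a duplication.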

To eliminate cDNA for mitochondrial genes, it is preferable to include in a parallel analysis 
a mitochondrial DNA preparation digested with the same restriction enzyme. Any cDNA probe that 
hybridizes to the appropriate mitochondrial restriction fragments can be suspected as 
corresponding to a mitochondrial gene.

In the initial replication of the RNA, the random primer may bind at any location along the 
RNA sequence. Thus, the copied and replicated segment may be a fragment of the full-length 
RNA. Longer cDNA corresponding to a greater portion of the sequence can be obtained, if 
desired, by several techniques known to practitioners of ordinary skill. These include using the 
DNA fragment to isolate the corresponding RNA, or to isolate complementary DNA from a cDNA 
library of the same species. Preferably, the library is derived from the same tissue source, and 
more preferably from a cancer cell line of the same type. For example, for cDNA corresponding to 
human breast cancer genes, a preferred library is derived from breast cancer cell line BT474, 
constructed in lambda GT10.

The cDNA may be sequenced by standard methods, for example by submitting a 
sample to commercial sequencing services. The chromosomal locations of the genes can be 
determined by any one of several methods known in the art, such as in situ hybridization using 
chromosomal smears, or panels of somatic cell hybrids of known chromosomal composition.

The cDNA obtained through the selection process outlined can then be tested against a 
larger panel of cancer cell lines and/or fresh tumor cells to determine what proportion of the cells 
have duplicated the gene. This can be accomplished by using the cDNA as a probe for 
chromosomal DNA digests, as described earlier. As illustrated in the Example section, a preferred 
method for conducting this determination is Southern analysis.

The cDNA can also be used to determine what proportion of the cells have RNA 
overabundance. This can be accomplished by standard techniques, such as slot blots or blots of 
agarose gels, using whole RNA or messenger RNA from each of the cells in the panel. The blots 
are then probed with the cDNA using standard techniques. It is preferable to provide an internal 
loading and blotting control for this analysis. A preferred method is to re-probe the same blot for 
transcripts of a gene likely to be present at about the same level in all cells of the same type, such 
as the gene for a cytoskeletal protein. Thus, a preferred second probe is the cDNA for beta-actin.

Using a novel cDNA found by this selection procedure, it is anticipated that essentially all 
cancer cells showing gene duplication will also show RNA overabundance, but that some will show 
RNA overabundance without gene duplication.
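
The expected pattern across a panel can be tabulated once each cell line has a normalized DNA value and a normalized RNA value. The sketch below is only illustrative: the two-fold threshold, the cell line names and the measurements are hypothetical, and the RNA ratio is assumed to have been normalized to a loading control such as beta-actin.

```python
def classify_cell_line(copy_ratio, rna_ratio, threshold=2.0):
    """Label one cell line from normalized DNA and RNA measurements.

    copy_ratio: gene copy number relative to control DNA.
    rna_ratio:  RNA level relative to control cells.
    threshold:  assumed cut-off for calling either value elevated.
    """
    duplicated = copy_ratio >= threshold
    overabundant = rna_ratio >= threshold
    if duplicated and overabundant:
        return "duplication with RNA overabundance"
    if overabundant:
        return "RNA overabundance without duplication"
    if duplicated:
        return "duplication without RNA overabundance"
    return "neither"

# Hypothetical panel: (copy_ratio, rna_ratio) per cell line
panel = {"line_A": (4.0, 6.0), "line_B": (1.0, 3.5), "line_C": (1.1, 0.9)}
summary = {name: classify_cell_line(*values) for name, values in panel.items()}
```

Lines falling in the second category are the ones that suggest an alternative mechanism for raising RNA levels.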

The practitioner will readily appreciate that the strategies for identifying genes that are 
duplicated and/or associated with RNA overabundance may be reversed appropriately to screen 
for genes that are deleted and/or associated with RNA underabundance. The principles are 
essentially the same. Genes that are frequently down-regulated in cancer (such as tumor 
suppressor genes) may be down-regulated by different mechanisms in different cells, and a gene 
with this behavior is more likely to be central to malignant transformation or persistence of the 
malignant state.

To screen for such down-regulated genes according to the present invention, RNA is 
prepared from a plurality of tumors or cancer cell lines and the abundance is compared with RNA 
preparations from control cells. Again, it is highly preferable to use cancer cells that share a deleted 
gene in the same chromosomal region, in order to focus any differences at the RNA level towards 
particular alterations in cancer cells and away from normal variations or coincidental changes. The 
CGH technique may be used to identify deletions in previously uncharacterized cancer cells. As 
before, cancer cells may be chosen on the basis of previous knowledge of deleted regions; there is 
no need to conduct methods such as CGH on previously characterized lines. cDNA from the RNA 
of cancer cells is displayed (preferably by differential display) alongside cDNA copied from 
(preferably uncultured) control cells, and cDNA is selected that appears to be underrepresented in 
at least two (preferably more) of the cancer cells compared with the control cells. cDNA thus 
selected may optionally be further screened against digested DNA preparations, to confirm that the 
RNA underabundance observed in the cancer cell populations is attributable in at least a proportion 
of the cells to an actual gene deletion.

As before, the cDNA may be used for sequencing or rescuing additional polynucleotides, in 
this case not from the cancer cells but from cells containing or expressing the gene at normal 
levels. Pharmaceuticals based on deleted genes or those associated with underexpressed RNA 
are typically oriented at restoring or upregulating the gene, or a functional equivalent of the 
encoded gene product.

To identify particular RNA that is overabundant in cancer cells, RNA has been compared 
between breast cancer cells and control cells. The amount of total cellular RNA was compared using 
a modified differential display method. Primers were used for the 3' region of the mRNAs which have 
an oligo-dT sequence, followed by two other nucleotides as described in the previous section. 
Random or arbitrary primers of about 10 nucleotides were used for replication towards what 
corresponds in the sequence to the 5' region of the mRNA. The labeled amplification product was 
then separated by molecular weight on a polyacrylamide sequencing gel.

Particular mRNAs were chosen that were present as a higher proportion of the RNA in 
cancerous cells, compared with control cells, according to the proportional intensity of the 
cDNA bands. The cDNA was recovered directly from the gel and used in 
Northern and Southern analysis to determine whether the corresponding genes were duplicated, or 
responsible for RNA overabundance in breast cancer cells. Sequence data for the polynucleotides 
was obtained and compared with sequences in databases. Novel polynucleotide 
fragments were used to probe for longer cDNA inserts in a library of 
breast cancer cell line BT474, which were then sequenced.

Further description of the actual experimental events that occurred during the identification of 
the four genes, and sequence data for CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14- 
2a16-1 are provided in the Example section.

Preparation of polynucleotides, polypeptides and antibodies 

Polynucleotides based on the cDNA of CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14- 
2a16-1 can be rescued from cloned plasmids and phage provided as part of this invention. They 
may also be obtained from breast cancer cell libraries or mRNA preparations, or from normal human 
cells, by judicious use of primers or probes based on the sequence data provided 
herein. Alternatively, the sequence data provided herein can be used in chemical synthesis to 
produce a polynucleotide with an identical sequence, or that incorporates occasional variations.

Polypeptides encoded by the corresponding mRNA can be prepared by several different 
methods, all of which will be known to a practitioner of ordinary skill. For example, the appropriate 
strand of the full-length cDNA can be operably linked to a suitable promoter, and transfected into a 
suitable host cell. The host cell is then cultured under conditions that allow transcription and 
translation to occur, and the polypeptide is subsequently recovered. Another method is to 
determine the polynucleotide sequence of the cDNA, and predict the polypeptide sequence according 
to the genetic code. A polypeptide can then be prepared directly, for example, by chemical synthesis, 
either identical to the predicted sequence, or incorporating occasional variations.
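
Predicting a polypeptide sequence from a cDNA sequence according to the genetic code can be sketched as follows. The example uses the standard genetic code and, as a simplifying assumption, starts translation at the first ATG found in the coding strand; real open reading frame determination may need to consider other frames and start sites. The function name and the test sequence are hypothetical.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def predict_polypeptide(cdna):
    """Translate the coding strand of a cDNA, 5' to 3', from its first ATG
    up to the first in-frame stop codon. Returns "" if no ATG is found."""
    seq = cdna.upper().replace("U", "T")
    start = seq.find("ATG")
    if start < 0:
        return ""
    protein = []
    for pos in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")  # X = ambiguous codon
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

peptide = predict_polypeptide("GGATGGCCTGGTAAAT")
```

The predicted sequence is written in one-letter amino acid code and could serve as the template for chemical synthesis.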

Antibodies against polypeptides of this invention may be prepared by any method known in 
the art. For stimulating antibody production in an animal, it is often preferable to enhance the 
immunogenicity of a polypeptide by such techniques as polymerization with glutaraldehyde or 
combining with an adjuvant such as Freund's adjuvant. The immunogen is injected into a suitable 
experimental animal: preferably a rodent for the preparation of monoclonal antibodies; preferably a 
larger animal such as a rabbit or sheep for preparation of polyclonal antibodies. It is preferable to 
provide a second or booster injection after about 4 weeks, and begin harvesting the antibody source 
no less than about 1 week later.

Sera harvested from the immunized animals provide a source of polyclonal antibodies. 
Detailed procedures for purifying specific antibody activity from a source material are known within the 
art. Unwanted activity cross-reacting with other antigens, if present, can be removed, for example, by 
running the preparation over adsorbents made of those antigens attached to a solid phase, and 
collecting the unbound fraction. If desired, the specific antibody activity can be further purified by such 
techniques as protein A chromatography, ammonium sulfate precipitation, ion exchange 
chromatography, high-performance liquid chromatography and immunoaffinity chromatography on a 
column of the immunizing polypeptide coupled to a solid support.

Alternatively, immune cells such as splenocytes can be recovered from the immunized 
animals and used to prepare a monoclonal antibody-producing cell line. See, for example, Harlow & 
Lane (1988), U.S. Patent Nos. 4,491,632 (J.R. Wands et al.), 4,472,500 (C. Milstein et al.), and 
4,444,887 (M.K. Hoffman et al.).

Briefly, an antibody-producing line can be produced inter alia by cell fusion, or by transfecting 
antibody-producing cells with Epstein Barr Virus, or transforming with oncogenic DNA. The treated 
cells are cloned and cultured, and clones are selected that produce antibody of the desired specificity. 
Specificity testing can be performed on culture supernatants by a number of techniques, such as 
using the immunizing polypeptide as the detecting reagent in a standard immunoassay, or using cells 
expressing the polypeptide in immunohistochemistry. A supply of monoclonal antibody from the 
selected clones can be purified from a large volume of tissue culture supernatant, or from the ascites 
fluid of suitably prepared host animals injected with the clone. 

Effective variations of this method include those in which the immunization with the 
polypeptide is performed on isolated cells. Antibody fragments and other derivatives can be prepared 
by methods of standard protein chemistry, such as subjecting the antibody to cleavage with a 
proteolytic enzyme. Genetically engineered variants of the antibody can be produced by obtaining a 
polynucleotide encoding the antibody, and applying the general methods of molecular biology to 
introduce mutations and translate the variant.

Use in diagnosis

Novel cDNA sequences corresponding to genes associated with cancer are potentially useful 
as diagnostic aids. Similarly, polypeptides encoded by such genes, and antibodies specific for these 
polypeptides, are also potentially useful as diagnostic aids.

More specifically, gene duplication or overabundance of RNA in particular cells can help 
identify those cells as being cancerous, and thereby play a part in the initial diagnosis. Increased 
levels of RNA corresponding to CH1-9a11-2, CH8-2a13-12, CH13-2a12-1, and CH14-2a16-1 are 
present in a substantial proportion of breast cancer cell lines and primary breast tumors. In addition, 
preliminary Northern analysis using probes for CH8-2a13-12, CH13-2a12-1, and CH14-2a16-1 
indicates that these genes may be duplicated or be associated with RNA overabundance in certain 
cell lines derived from cancers other than breast cancer, including colon cancer, lung cancer, 
prostate cancer, glioma, and ovarian cancer.

For patients already diagnosed with cancer, gene duplication or overabundance of RNA can 
assist with clinical management and prognosis. For example, overabundance of RNA may be a 
useful predictor of disease survival, metastasis, susceptibility to various regimens of standard 
chemotherapy, the stage of the cancer, or its aggressiveness. See generally the article by Blast, U.S. 
Patent No. 4,968,603 (Slamon et al.) and PCT Application WO 94/00601 (Levine et al.). All of these 
determinations are important in helping the clinician choose between the available treatment options.

A particularly important diagnostic application contemplated in this invention is the 
identification of patients suitable for gene-specific therapy, as outlined in the following section. For 
example, treatment directed against a particular gene or gene product is appropriate in cancers where 
the gene is duplicated or there is RNA overabundance. Given a particular pharmaceutical that is 
directed at a particular gene, a diagnostic test specific for the same gene is important in selecting 
patients likely to benefit from the pharmaceutical. Given a selection of such pharmaceuticals specific 
for different genes, diagnostic tests for each gene are important in selecting which pharmaceutical is 
likely to benefit a particular patient.

The polynucleotide, polypeptide, and antibodies embodied in this invention provide specific 
reagents that can be used in standard diagnostic procedures. The actual procedures for conducting 
diagnostic tests are extensively known in the art, and are routine for a practitioner of ordinary skill. 
See, for example, U.S. Patent No. 4,968,603 (Slamon et al.), and PCT Applications WO 94/00601 
(Levine et al.) and WO 94/17414 (K. Keyomarsi et al.). What follows is a brief non-limiting survey of 
some of the known procedures that can be applied.

Generally, to perform a diagnostic method of this invention, one of the compositions of this 
invention is provided as a reagent to detect a target in a clinical sample with which it reacts. Thus, the 
polynucleotide of this invention can be used as a reagent to detect a DNA or RNA target, such as 
might be present in a cell with duplication or RNA overabundance of the corresponding gene. The 
polypeptide can be used as a reagent to detect a target for which it has a specific binding site, such as 
an antibody molecule or (if the polypeptide is a receptor) the corresponding ligand. The antibody can 
be used as a reagent to detect a target it specifically recognizes, such as the polypeptide used as an 
immunogen to raise it.

The target is supplied by obtaining a suitable tissue sample from an individual for whom the 
diagnostic parameter is to be measured. Relevant test samples are those obtained from individuals 
suspected of containing cancerous cells, particularly breast cancer cells. Many types of samples are 
suitable for this purpose, including those that are obtained near the suspected tumor site by biopsy or 
surgical dissection, in vitro cultures of cells derived therefrom, blood, and blood components. If 
desired, the target may be partially purified from the sample or amplified before the assay is 
conducted. The reaction is performed by contacting the reagent with the sample under conditions that 
will allow a complex to form between the reagent and the target. The reaction may be performed in 
solution, or on a solid tissue sample, for example, using histology sections. The formation of the 
complex is detected by a number of techniques known in the art. For example, the reagent may be 
supplied with a label and unreacted reagent may be removed from the complex, the amount of 
remaining label thereby indicating the amount of complex formed. Further details and alternatives for 
complex detection are provided in the descriptions that follow.

To determine whether the amount of complex formed is representative of cancerous or non- 
cancerous cells, the assay result is compared with a similar assay conducted on a control sample. It 
is generally preferable to use a control sample which is from a non-cancerous source, and otherwise 
similar in composition to the clinical sample being tested. However, any control sample may be 
suitable provided the relative amount of target in the control is known or can be used for comparative 
purposes. Where the assay is being conducted on tissue sections, suitable control cells with normal 
histopathology may surround the cancerous cells being tested. It is often preferable to conduct the 
assay on the test sample and the control sample simultaneously. However, if the amount of complex 
formed is quantifiable and sufficiently consistent, it is acceptable to assay the test sample and control 
sample on different days or in different laboratories.

A polynucleotide embodied in this invention can be used as a reagent for determining gene 
duplication or RNA overabundance that may be present in a clinical sample. The binding of the 
reagent polynucleotide to a target in a clinical sample generally relies in part on a hybridization 
reaction between a region of the polynucleotide reagent, and the DNA or RNA in a sample being 
tested.

If desired, the nucleic acid may be extracted from the sample, and may also be partially 
purified. To measure gene duplication, the preparation is preferably enriched for chromosomal DNA; 
to measure RNA overabundance, the preparation is preferably enriched for RNA. The target 
polynucleotide can be optionally subjected to any combination of additional treatments, including 
digestion with restriction endonucleases, size separation, for example by electrophoresis in agarose 
or polyacrylamide, and affixing to a reaction matrix, such as a blotting material.

Hybridization is allowed to occur by mixing the reagent polynucleotide with a sample 
suspected of containing a target polynucleotide under appropriate reaction conditions. This may be 
followed by washing or separation to remove unreacted reagent. Generally, both the target 
polynucleotide and the reagent must be at least partly equilibrated into the single-stranded form in 
order for complementary sequences to hybridize efficiently. Thus, it may be useful (particularly in 
tests for DNA) to prepare the sample by standard denaturation techniques known in the art.

The minimum complementarity between the reagent sequence and the target sequence for a 
complex to form depends on the conditions under which the complex-forming reaction is allowed to 
occur. Such conditions include temperature, ionic strength, time of incubation, the presence of 
additional solutes in the reaction mixture such as formamide, and washing procedure. Higher 
stringency conditions are those under which higher minimum complementarity is required for stable 
hybridization to occur. It is generally preferable in diagnostic applications to increase the specificity of 
the reaction, minimizing cross-reactivity of the reagent polynucleotide with alternative undesired 
hybridization sites in the sample. Thus, it is preferable to conduct the reaction under conditions of 
high stringency: for example, in the presence of high temperature, low salt, formamide, a combination 
of these, or followed by a low-salt wash.
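
The way salt and formamide affect stringency can be illustrated with a widely quoted empirical approximation for the melting temperature of long DNA:DNA hybrids. This formula is not part of the present disclosure, its coefficients vary slightly between sources, and it does not apply to short oligonucleotides; it is offered only as a rough sketch of the trend.

```python
import math

def estimate_tm(na_molar, gc_percent, length, formamide_percent=0.0):
    """Approximate melting temperature (deg C) of a long DNA:DNA hybrid.

    na_molar:          monovalent cation concentration in mol/L
    gc_percent:        G+C content of the hybrid, 0 to 100
    length:            hybrid length in base pairs
    formamide_percent: formamide concentration, percent by volume
    Empirical rule of thumb; treat the result as an estimate only.
    """
    return (81.5
            + 16.6 * math.log10(na_molar)
            + 0.41 * gc_percent
            - 0.61 * formamide_percent
            - 500.0 / length)

high_salt = estimate_tm(1.0, 50, 500)
low_salt = estimate_tm(0.1, 50, 500)
with_formamide = estimate_tm(1.0, 50, 500, formamide_percent=50)
```

Lowering the salt concentration or adding formamide lowers the estimated melting temperature, so at a fixed incubation or wash temperature only better-matched hybrids remain stable.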

In order to detect the complexes formed between the reagent and the target, the reagent is 
generally provided with a label. Some of the labels often used in this type of assay include 
radioisotopes such as 32P and 33P, chemiluminescent or fluorescent reagents such as fluorescein, and 
enzymes such as alkaline phosphatase that are capable of producing a colored solute or precipitant. 
The label may be intrinsic to the reagent, it may be attached by direct chemical linkage, or it may be 
connected through a series of intermediate reactive molecules, such as a biotin-avidin complex, or a 
series of inter-reactive polynucleotides. The label may be added to the reagent before hybridization 
with the target polynucleotide, or afterwards.

To improve the sensitivity of the assay, it is often desirable to increase the signal ensuing 
from hybridization. This can be accomplished by replicating either the target polynucleotide or the 
reagent polynucleotide, such as by a polymerase chain reaction. Alternatively, a combination of 
serially hybridizing polynucleotides or branched polynucleotides can be used in such a way that 
multiple label components become incorporated into each complex. See U.S. Patent No. 5,124,246 
(Urdea et al.).

An antibody embodied in this invention can also be used as a reagent in cancer diagnosis, or 
for determining gene duplication or RNA overabundance that may be present in a clinical sample. 
This relies on the fact that overabundance of RNA in affected cells is often associated with increased 
production of the corresponding polypeptide. Several of the genes up-regulated in cancer cells 
encode cell surface receptors, for example, erbB-2, c-myc and epidermal growth factor. 
Alternatively, the RNA may encode a protein kept inside the cell, or it may encode a protein secreted 
by the cell into the surrounding milieu.

Any such protein product can be detected in solid tissue samples and cultured cells by 
immunohistological techniques that will be obvious to a practitioner of ordinary skill. Generally, the 
tissue is preserved by a combination of techniques which may include cooling, exchanging into 
different solvents, fixing with agents such as paraformaldehyde, or embedding in a commercially 
available medium such as paraffin or OCT. A section of the sample is suitably prepared and overlaid 
with a primary antibody specific for the protein. 

The primary antibody may be provided directly with a suitable label. More frequently, the 
primary antibody is detected using one of a number of developing reagents which are easily produced 
or available commercially. Typically, these developing reagents are anti-immunoglobulin or protein A, 
and they typically bear labels which include, but are not limited to, fluorescent markers such as 
fluorescein, enzymes such as peroxidase that are capable of precipitating a suitable chemical 
compound, electron dense markers such as colloidal gold, or radioisotopes such as 125I. The section 
is then visualized using an appropriate microscopic technique, and the level of labeling is compared 
between the suspected cancer cell and a control cell, such as cells surrounding the tumor area or 
those taken from an alternative site. 

The amount of protein corresponding to the cancer-associated gene may be detected in a 
standard quantitative immunoassay. If the protein is secreted or shed from the cell in any appreciable 
amount, it may be detectable in plasma or serum samples. Alternatively, the target protein may be 
solubilized or extracted from a solid tissue sample. Before quantifying, the protein may optionally be 
affixed to a solid phase, such as by a blot technique or using a capture antibody.

A number of immunoassay methods are established in the art for performing the quantitation. 
For example, the protein may be mixed with a pre-determined non-limiting amount of the reagent 
antibody specific for the protein. The reagent antibody may contain a directly attached label, such as 
an enzyme or a radioisotope, or a second labeled reagent may be added, such as 
anti-immunoglobulin or protein A. For a solid-phase assay, unreacted reagents are removed by 
washing. For a liquid-phase assay, unreacted reagents are removed by some other separation 
technique, such as filtration or chromatography. The amount of label captured in the complex is 
positively related to the amount of target protein present in the test sample. A variation of this 
technique is a competitive assay, in which the target protein competes with a labeled analog for 
binding sites on the specific antibody. In this case, the amount of label captured is negatively related 
to the amount of target protein present in a test sample. Results obtained using any such assay on a 
sample from a suspected cancer-bearing source are compared with those from a noncancerous 
source.
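
The positive and negative relationships between captured label and target protein can both be handled by reading the signal off a calibration curve. The sketch below assumes that a set of calibration standards of known concentration has been run alongside the samples, which is common practice but is not stated in the text; the function name and all numbers are hypothetical.

```python
def interpolate_concentration(signal, standards):
    """Estimate target concentration from a label signal by linear
    interpolation between calibration standards.

    `standards` is a list of (concentration, signal) pairs. The signal
    may rise with concentration (direct assay) or fall with it
    (competitive assay); either works as long as it is monotonic.
    """
    points = sorted(standards, key=lambda p: p[1])  # order by signal
    for (c_lo, s_lo), (c_hi, s_hi) in zip(points, points[1:]):
        if s_lo <= signal <= s_hi and s_hi != s_lo:
            fraction = (signal - s_lo) / (s_hi - s_lo)
            return c_lo + fraction * (c_hi - c_lo)
    raise ValueError("signal lies outside the calibrated range")

direct_curve = [(0, 0), (10, 100), (20, 200)]       # signal rises with target
competitive_curve = [(0, 200), (10, 100), (20, 0)]  # signal falls with target

direct_estimate = interpolate_concentration(150, direct_curve)
competitive_estimate = interpolate_concentration(150, competitive_curve)
```

The same signal of 150 units corresponds to a high concentration in the direct format and a low one in the competitive format, which is why the assay format must be known before a test sample is compared with a noncancerous control.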

A polypeptide embodied in this invention can also be used as a reagent in cancer diagnosis, 
or for determining gene duplication or RNA overabundance that may be present in a clinical sample. 
Overabundance of RNA in affected cells may result in the corresponding polypeptide being produced 
by the cells in an abnormal amount. On occasion, overabundance of RNA may occur concurrently 
with expression of the polypeptide in an unusual form. This in turn may result in stimulation of the 
immune response of the host to produce its own antibody molecules that are specific for the 
polypeptide. Thus, a number of human hybridomas have been raised from cancer patients that 
produce antibodies against their own tumor antigens. 

To use the polypeptide in the detection of such antibodies in a subject suspected of having 
cancer, an immunoassay is conducted. Suitable methods are generally the same as the 
immunoassays outlined in the preceding paragraphs, except that the polypeptide is provided as a 
reagent, and the antibody is the target in the clinical sample which is to be quantified. For example, 
human IgG antibody molecules present in a serum sample may be captured with solid-phase protein 
A, and then overlaid with the labeled polypeptide reagent. The amount of antibody would then be 
proportional to the label attached to the solid phase. Alternatively, cells or tissue sections expressing 
the polypeptide may be overlaid first with the test sample containing the antibody, and then with a 
detecting reagent such as labeled anti-immunoglobulin. The amount of antibody would then be 
proportional to the label attached to the cells. The amount of antibody detected in the sample from a 
suspected cancerous source would be compared with the amount detected in a control sample. 

These diagnostic procedures may be performed by diagnostic laboratories, experimental 
laboratories, practitioners, or private individuals. This invention provides diagnostic kits which can be 
used in these settings. The presence of cancer cells in the individual may be manifest in a clinical 
sample obtained from that individual as an alteration in the DNA, RNA, protein, or antibodies 
contained in the sample. An alteration in one of these components resulting from the presence of 
cancer may take the form of an increase or decrease of the level of the component, or an alteration in 
the form of the component, compared with that in a sample from a healthy individual. The clinical 
sample is optionally pre-treated for enrichment of the target being tested for. The user then applies a 
reagent contained in the kit in order to detect the changed level or alteration in the diagnostic 
component.

Each kit necessarily comprises the reagent which renders the procedure specific: a reagent 
polynucleotide, used for detecting target DNA or RNA; a reagent antibody, used for detecting target 
protein; or a reagent polypeptide, used for detecting target antibody that may be present in a sample 
to be analyzed. The reagent is supplied in a solid form or liquid buffer that is suitable for inventory 
storage, and later for exchange or addition into the reaction medium when the test is performed. 
Suitable packaging is provided. The kit may optionally provide additional components that are useful 
in the procedure. These optional components include buffers, capture reagents, developing reagents, 
labels, reacting surfaces, means for detection, control samples, instructions, and interpretive 
information.

Use in pharmaceutical development

Embodied in this invention are modes of treating subjects bearing cancer cells that have 
overabundance of the particular RNA described. The strategy used to obtain the cDNAs provided in 
this invention was deliberately focused on genes that achieve RNA overabundance by gene 
duplication in some cells, and by alternative mechanisms in other cells. These alternative 
mechanisms may include, for example, translocation or enhancement of transcription enhancing 
elements near the coding region of the gene, deletion of repressor binding sites, or altered production 
of gene regulators. Such mechanisms would result in more RNA being transcribed from the same 
gene. Alternatively, the same amount of RNA may be transcribed, but may persist longer in the cell, 
resulting in greater abundance. This could occur, for example, by reduction in the level of ribozymes 
or protein enzymes that degrade RNA, or in the modification of the RNA to render it more resistant to 
such enzymes or spontaneous degradation.

Thus, different cells make use of at least two different mechanisms to achieve a single result:
the overabundance of a particular RNA. This suggests that RNA overabundance of these genes is
central to the cancer process in the affected cells. Interfering with the specific gene or gene product 
would consequently modify the cancer process. It is an objective of this invention to provide 
pharmaceutical compositions that enable therapy of this kind. 

One way this invention achieves this objective is through screening candidate drugs. The 
general screening strategy is to apply the candidate to a manifestation of a gene associated with 
cancer, and then determine whether the effect is beneficial and specific. For example, a composition 
that interferes with a polynucleotide or polypeptide corresponding to any of the novel cancer-associated
genes described herein has the potential to block the associated pathology when administered to a
tumor of the appropriate phenotype. It is not necessary that the mechanism of interference be known,
only that the interference be preferential for cancerous cells (or cells near the cancer site) but not
other cells.

A preferred method of screening is to provide cells in which a polynucleotide related to a 
cancer gene has been transfected. See, for example, PCT application WO 93/08701. A practitioner 
of ordinary skill will be well acquainted with techniques for transfecting eukaryotic cells, including the 
preparation of a suitable vector, such as a viral vector; conveying the vector into the cell, such as by 
electroporation; and selecting cells that have been transformed, such as by using a reporter or drug
sensitivity element.

A cell line is chosen which has a phenotype desirable in testing, and which can be maintained 
well in culture. The cell line is transfected with a polynucleotide corresponding to one of the 
cancer-associated genes identified herein. Transfection is performed such that the polynucleotide is 
operably linked to a genetic controlling element that permits the correct strand of the polynucleotide to 
be transcribed within the cell. Successful transfection can be determined by the increased abundance 
of the RNA compared with an untransfected cell. It is not necessary that the cell previously be devoid
of the RNA, only that the transfection result in a substantial increase in the level observed. RNA
abundance in the cell is measured using the same polynucleotide, according to the hybridization
assays outlined earlier.

Drug screening is performed by adding each candidate to a sample of transfected cells, and 
monitoring the effect. The experiment includes a parallel sample which does not receive the 
candidate drug. The treated and untreated cells are then compared by any suitable phenotypic 
criteria, including but not limited to microscopic analysis, viability testing, ability to replicate, 
histological examination, the level of a particular RNA or polypeptide associated with the cells, the 
level of enzymatic activity expressed by the cells or cell lysates, and the ability of the cells to interact
with other cells or compounds. Differences between treated and untreated cells indicate effects
attributable to the candidate. In a preferred method, the effect of the drug on the cell transfected with
the polynucleotide is also compared with the effect on a control cell. Suitable control cells include 
untransfected cells of similar ancestry, cells transfected with an alternative polynucleotide, or cells 
transfected with the same polynucleotide in an inoperative fashion. Optimally, the drug has a greater 
effect on operably transfected cells than on control cells. 
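
By way of illustration only, the two comparisons described above (treated versus untreated cells, and operably transfected versus control cells) can be combined into a single number. The following Python sketch assumes a hypothetical readout (viable cell count) and hypothetical function names; the specification does not prescribe any particular metric, and any suitable phenotypic criterion could be substituted.

```python
def drug_effect(untreated, treated):
    """Fractional reduction in a readout (e.g. viable cell count) caused by the drug."""
    return (untreated - treated) / untreated


def selectivity(transfected, control):
    """Difference in drug effect between transfected and control cells.

    Each argument is a pair (untreated_readout, treated_readout).
    A positive value means the drug acts more strongly on transfected cells.
    """
    return drug_effect(*transfected) - drug_effect(*control)


# Hypothetical viable-cell counts, for illustration only
transfected = (1000, 400)  # 60% reduction in operably transfected cells
control = (1000, 900)      # 10% reduction in control cells
print(round(selectivity(transfected, control), 2))  # 0.5
```

A candidate with a selectivity near zero affects both cell types equally, which would suggest a non-specific effect.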

Desirable effects of a candidate drug include an effect on any phenotype that was conferred 
by transfection of the cell line with the polynucleotide from the cancer-associated gene, or an effect 
that could limit a pathological feature of the gene in a cancerous cell. Examples of the first type would 
be a drug that limits the overabundance of RNA in the transfected cell, limits production of the 
encoded protein, or limits the functional effect of the protein. The effect of the drug would be apparent 
when comparing results between treated and untreated cells. An example of the second type would 
be a drug that makes use of the transfected gene or a gene product to specifically poison the cell.
The effect of the drug would be apparent when comparing results between operably transfected cells 
and control cells. 

This invention also provides gene-specific pharmaceuticals in which each of the
polynucleotides, polypeptides, and antibodies embodied herein is a specific active ingredient in
pharmaceutical compositions. Such compositions may decrease the pathology of cancer cells on
their own, or render the cancer cells more susceptible to treatment by non-specific agents, such
as classical chemotherapy or radiation.

An example of how polynucleotides embodied in this invention can be effectively used in
treatment is gene therapy. See, for example, Morgan et al., Culver et al., and U.S. Patent No.
5,399,346 (French et al.). The general principle is to introduce the polynucleotide into a cancer cell in
a patient, and allow it to interfere with the expression of the corresponding gene, such as by
complexing with the gene itself or with the RNA transcribed from the gene. Entry into the cell is
facilitated by suitable techniques known in the art, such as providing the polynucleotide in the form of a
suitable vector, or encapsulation of the polynucleotide in a liposome. The polynucleotide may be
provided to the cancer site by an antigen-specific homing mechanism, or by direct injection.

A preferred mode of gene therapy is to provide the polynucleotide in such a way that it will
replicate inside the cell, enhancing and prolonging the interference effect. Thus, the polynucleotide is
operably linked to a suitable promoter, such as the natural promoter of the corresponding gene, a
heterologous promoter that is intrinsically active in cancer cells, or a heterologous promoter that can
be induced by a suitable agent. Preferably, the construct is designed so that the polynucleotide
sequence operably linked to the promoter is complementary to the sequence of the corresponding
gene. Thus, once integrated into the cellular genome, the transcript of the administered
polynucleotide will be complementary to the transcript of the gene, and capable of hybridizing with it.
This approach is known as anti-sense therapy. See, for example, Culver et al. and Roth.
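
The complementarity relied on in anti-sense therapy can be illustrated with a short Python sketch. The input sequence and the function names are hypothetical and are not taken from the sequences of this invention; the sketch only shows how a transcript complementary to a given coding-strand fragment would be derived.

```python
# Translation table mapping each base to its Watson-Crick partner
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return dna.translate(COMPLEMENT)[::-1]


def antisense_rna(sense_dna):
    """RNA (5'->3') complementary to the transcript of the given coding strand."""
    return reverse_complement(sense_dna).upper().replace("T", "U")


sense = "ATGGCTTCAGGA"  # hypothetical coding-strand fragment
print(reverse_complement(sense))  # TCCTGAAGCCAT
print(antisense_rna(sense))       # UCCUGAAGCCAU
```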

The use of antibodies embodied in this invention in the treatment of cancer partly relies on the
fact that genes that show RNA overabundance in cancer frequently encode cell-surface proteins.
Location of these proteins at the cell surface may correspond to an important biological function of the 
cancer cell, such as their interaction with other cells, the modulation of other cell-surface proteins, or 
triggering by an incoming cytokine. 

These mechanisms suggest a variety of ways in which a specific antibody may be effective in
decreasing the pathology of a cancer cell. For example, if the gene encodes a growth receptor,
then an antibody that blocks the ligand binding site or causes endocytosis of the receptor would
decrease the ability of the receptor to provide its signal to the cell. It is unnecessary to have
knowledge of the mechanism beforehand; the effectiveness of a particular antibody can be predicted
empirically by testing with cultured cancer cells expressing the corresponding protein. Monoclonal
antibodies may be more effective in this form of cancer therapy if several different clones directed at
different determinants of the same cancer-associated gene product are used in combination: see PCT
application WO 94/00136 (Kasprzyk et al.). Such antibody treatment may directly decrease the
pathology of the cancer cells, or render them more susceptible to non-specific cytotoxic agents such
as platinum (Lippman).

Another example of how antibodies can be used in cancer therapy is in the specific targeting 
of effector components. The protein product of the cancer-associated gene is expected to appear in 
high frequency on cancer cells compared to unaffected cells, due to the overabundance of the
corresponding RNA. The protein therefore provides a marker for cancer cells that a specific antibody 
can bind to. An effector component attached to the antibody therefore becomes concentrated near 
the cancer cells, improving the effect on those cells and decreasing the effect on non-cancer cells. 
This concentration would generally occur not only near the primary tumor, but also near cancer cells 
that have metastasized to other tissue sites. Furthermore, if the antibody is able to induce 
endocytosis, this will enhance entry of the effector into the cell interior. 

For the purpose of targeting, an antibody specific for the protein of the cancer-associated 
gene is conjugated with a suitable effector component, preferably by a covalent or high-affinity bond. 
Suitable effector components in such compositions include radionuclides such as 131I, toxic chemicals
such as vincristine, and toxic peptides such as diphtheria toxin. Other suitable effector components
include peptides or polynucleotides capable of altering the phenotype of the cell in a desirable fashion:
for example, installing a tumor suppressor gene, or rendering them susceptible to immune attack.

In most applications of antibody molecules in human therapy, it is preferable to use human 
monoclonals, or antibodies that have been humanized by techniques known in the art. This helps 
prevent the antibody molecules themselves from becoming a target of the host's immune system.

An example of how polypeptides embodied in this invention can be effectively used in 
treatment is through vaccination. The growth of cancer cells is naturally limited in part due to immune
surveillance. This refers to the recognition of cancer cells by immune recognition units, particularly
antibodies and T cells, and the consequent triggering of immune effector functions that limit tumor
progression. Stimulation of the immune system using a particular tumor-specific antigen enhances
the effect towards the tumor expressing the antigen. Thus, an active vaccine comprising a
polypeptide encoded by the cDNA of this invention would be appropriately administered to subjects
having overabundance of the corresponding RNA. There may also be a prophylactic role for the
vaccine in a population predisposed for developing cancer cells with overabundance of the same
RNA.

Ways of increasing the effectiveness of cancer vaccines are known in the art (Beardsley,
MacLean et al.). For example, synthetic antigens are conjugated to a carrier like keyhole limpet
hemocyanin (KLH), and then combined with an adjuvant such as DETOX™, a mixture of
mycobacterial cell walls and lipid A. Any polypeptide encoded by the four novel genes described in
this invention can be used in analogous compositions.

Methods for preparing and administering polypeptide vaccines are known in the art. Peptides
may be capable of eliciting an immune response on their own, or they may be rendered more
immunogenic by chemical manipulation, such as cross-linking or attaching to a protein carrier like
KLH. Preferably, the vaccine also comprises an adjuvant, such as alum, muramyl dipeptides,
liposomes, or DETOX™. The vaccine may optionally comprise auxiliary substances such as wetting
agents, emulsifying agents, and organic or inorganic salts or acids. It also comprises a
pharmaceutically acceptable excipient which is compatible with the active ingredient and appropriate
for the route of administration. The desired dose for peptide vaccines is generally from 10 µg to 1 mg,
with a broad effective latitude. The vaccine is preferably administered first as a priming dose, and
then again as a boosting dose, usually at least four weeks later. Further boosting doses may be given
to enhance the effect. The dose and its timing are usually determined by the person responsible for
the treatment.



Sequence data and deposits



The foregoing detailed description provides, inter alia, a detailed explanation of how genes 
associated with cancer can be identified and their cDNA obtained. Polynucleotide sequences for 
CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1 are provided.

The sequence data listed in this application were obtained by two-directional sequencing,
except where indicated otherwise. The data are believed to be accurate; nevertheless, it is readily
appreciated that the techniques of the art as used herein have the potential of introducing occasional
and infrequent sequence errors. Clones and inserts obtained via PCR may also comprise occasional
errors introduced during amplification. Nucleotide sequences predicted from database compilations,
and sequence data obtained by one-directional sequencing, may also contain occasional errors in
accordance with the limitations of the underlying techniques. In addition, allelic variations to both
nucleotide and amino acid sequences may occur naturally or be deliberately induced. Differences of
any of these types between the sequences provided herein and the invention as practiced may be
present without departing from the spirit of the invention.

Sequence data for CH8-2a13-1 and CH13-2a12-1 cDNA are believed to comprise the entire
translated coding sequence, and 5' and 3' untranslated regions corresponding to those found in
typical mRNA transcripts. Multiple mRNA transcripts may be found depending on the patterns of
transcript processing in various cell types of interest. Sequence data for CH1-9a11-2 and
CH14-2a16-1 cDNA comprise a portion of the coding sequence and 3' untranslated regions.
Additional sequence is typically present in the corresponding mRNA transcripts, comprising an
additional coding region in the N-terminal direction of the protein, and possibly a 5' untranslated
region.

Certain embodiments of this invention may be practiced by polynucleotide synthesis 
according to the data provided herein, by rescuing an appropriate insert corresponding to the gene of 
interest from one of the deposits listed below, or by isolating a corresponding polynucleotide from a
suitable tissue source. Various useful probes and primers for use in polynucleotide isolation are 
provided herein, or may be designed from the sequence data. 

Three deposits have been made on May 31, 1996 with the American Type Culture Collection
(ATCC), 12301 Parklawn Drive, Rockville, Maryland 20852 under terms of the Budapest Treaty. The
deposits are outlined in Table 2:



TABLE 2: ATCC Deposits

BCGF1 (Accession No. 98074)

Mixture of E. coli with recombinant plasmids of cDNA fragments of genes
associated with breast cancer. The 8 recombinant plasmids may be separated
by plating on Ampicillin plates and selecting single colonies for analysis by PCR
using SP6 and T7 primers.

Gene             Subclone       Expected size of PCR product
CH1-9a11-2       pch1-1.1       1.1 kb
                 pch1-2.5       2.5 kb
CH8-2a13-1       pch8-600       600 bp
                 pch8-3k        3.0 kb
                 pch8-4k        4.0 kb
CH14-2a16-1      pch14-800      800 bp
                 pch14-1.6      1.6 kb
                 pch14-1.3      1.3 kb

BCGF2 (Accession No. 97595)

Mixture of λgt10 recombinant phages with cDNA inserts of genes associated
with breast cancer. The 2 phages may be separated by growing in the E. coli
host (strain NM514) and plating out for single plaques. These plaques can be
distinguished by PCR using λgt10 reverse and forward primers.

Gene             Phage          Expected size of PCR product
CH13-2a12-1      λch13-3.5      3.5 kb
CH14-2a16-1      λch14-2.5      2.5 kb

λBCBT474 (Accession No. 97594)

cDNA library derived from breast cancer cell line BT474 in λgt10 vector,
supplemented with a cDNA library from breast cancer cell line 600PE in λgt10
vector. The cDNA insert sizes range from about 0.5 to 5 kb.
λBCBT474 is a source of additional cDNA inserts corresponding to
CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, or CH14-2a16-1 not present in
BCGF-1 or BCGF-2.






Sequence databases contain sequences of polynucleotide and polypeptide fragments with 
various degrees of identity and overlap with certain embodiments of this invention. The following list
of accession numbers is provided for the interest of the reader; it is not intended to be comprehensive
or a limitation on the invention. The database disclosures do not typically indicate use in cancer
diagnosis, drug development or disease treatment.

The following GenBank accession numbers are listed in relation to CH1-9a11-2: dbEST 
N32686; N45113; N36176; N22982; AA278830; H88670; AA235936; AA236951; H26301; N28026; 
H88063; H88064; D61948; H88718; H26460; AA137920; AA145308; W12952; AA200687; N44164;
T27279; dbSTS G22044; G04961.

The following GenBank accession numbers are listed in relation to CH8-2a13-1: dbNR
D83780.

The following GenBank accession numbers are listed in relation to CH13-2a12-1: dbNR
U58090; dbEST AA182441; AA253924; AA179755; AA112715; AA112640; W67977; AA150317;
W68080; AA150243; AA100446; W69636; H46574; AA245889; AA100651; H77368; AA192778;
T85671; N32682; T86257; T78239; T77874; AA187866; 233557; R40816; N99802; R19302;
AA100650; N55904; AA257151; H77369; T79014.

The following GenBank accession numbers are listed in relation to CH14-2a16-1: dbEST
N64802; W56903; N31400; W95674; AA233551; AA233636; N24105; W03447; W25821; AA233666;
AA233647; N67843; D55778; T66839; N55370; N75650; AA280736; H97110; Z19643; H91250;
AA230765; R93089; T84665; W94857; R92873.

The examples presented below are provided as a further guide to a practitioner of ordinary
skill in the art, and are not meant to be limiting in any way.

Examples 

Example 1: Selecting cDNA for messenger RNA that is overabundant in breast cancer cells

Total RNA was isolated from each breast cancer cell line or control cell by centrifugation
through a gradient of guanidine isothiocyanate/CsCl. The RNA was treated with RNase-free DNase
(Promega, Madison, WI). After extraction with phenol-chloroform, the RNA preparations were stored
at -70°C. Oligo-dT polynucleotides for priming at the 3' end of messenger RNA with the sequence
T₁₁NM (where N ∈ {A,C,G} and M ∈ {A,C,G,T}) were synthesized according to standard protocols.
Arbitrary decamer polynucleotides (OPA01 to OPA20) for priming towards the 5' end were purchased
from Operon Biotechnology, Inc., Alameda, CA.
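
For orientation, the set of anchored oligo-dT primers defined by the N and M positions can be enumerated as in the following Python sketch. The function name is hypothetical, and the length of the T run (11 here) is an assumption passed as a parameter; only the rule that N is any base except T and M is any base is taken from the text.

```python
from itertools import product


def anchored_oligo_dt_primers(t_run=11):
    """Return every anchored primer of the form T(n)NM, written 5'->3'."""
    n_bases = "ACG"   # N: any base except T, so the primer anchors at the poly-A junction
    m_bases = "ACGT"  # M: any base
    return ["T" * t_run + n + m for n, m in product(n_bases, m_bases)]


primers = anchored_oligo_dt_primers()
print(len(primers))  # 12
print(primers[1])    # TTTTTTTTTTTAC
```

Three choices for N and four for M give twelve primers, each of which samples a different subset of the messenger RNA population.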

The RNA was reverse-transcribed using AMV reverse transcriptase (obtained from BRL) and
an anchored oligo-dT primer in a volume of 20 µL, according to the manufacturer's directions. The
reaction was incubated at 37°C for 60 min and stopped by incubating at 95°C for 5 min. The cDNA
obtained was used immediately or stored frozen at -70°C.

Differential display was conducted according to the following procedure: 1 µL cDNA was
replicated in a total volume of 10 µL PCR mixture containing the appropriate T₁₁NM sequence, 0.5 µM
of a decamer primer, 200 µM dNTP, 5 µCi [35S]-dATP (Amersham), Taq polymerase buffer with 2.5
mM MgCl2 and 0.3 unit Taq polymerase (Promega). Forty cycles were conducted in the following
sequence: 94°C for 30 sec, 40°C for 2 min, 72°C for 30 sec; and then the sample was incubated at
72°C for 5 min. The replicated cDNA was separated on a 6% polyacrylamide sequencing gel. After
electrophoresis, the gel was dried and exposed to X-ray film.
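
As a rough planning aid, the programmed hold time of the cycling profile described above (forty cycles of 94°C for 30 sec, 40°C for 2 min and 72°C for 30 sec, followed by 72°C for 5 min) can be totalled as in the following Python sketch. The variable and function names are hypothetical, and temperature ramp times, which depend on the instrument, are deliberately left out.

```python
# Programmed hold times from the cycling profile, in seconds
CYCLE_STEPS = [
    ("denature", 94, 30),
    ("anneal", 40, 120),
    ("extend", 72, 30),
]
N_CYCLES = 40
FINAL_EXTENSION_S = 5 * 60


def total_hold_seconds(steps, n_cycles, final_s):
    """Sum the hold times over all cycles plus the final extension (no ramp time)."""
    per_cycle = sum(seconds for _name, _temp_c, seconds in steps)
    return n_cycles * per_cycle + final_s


total_s = total_hold_seconds(CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION_S)
print(total_s / 60)  # 125.0 minutes of hold time
```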

The autoradiogram was analyzed for labeled cDNA that was present in larger relative amount 
in all of the lanes corresponding to breast cancer cells, compared with all of the lanes corresponding 
to control cells. Figure 1 provides an example of an autoradiogram from such an experiment. Lane
1 is from non-proliferating normal breast cells; lane 2 is from proliferating normal breast cells; lanes
3 to 5 are from breast cancer cell lines BT474, SKBR3, and MCF7. The left and right sides show
the patterns obtained from experiments using the same T₁₁NM sequence (T₁₁AC), but two different
decamer primers. The arrows indicate the cDNA fragments that were more abundant in all three
tumor lines compared with controls.

The assay illustrated in Figure 1 was conducted using different combinations of oligo-dT 
primers and decamer primers. A number of differentially expressed bands were detected when 
different primer combinations were used. However, not all differences seen initially were 
reproducible after re-screening. We therefore routinely repeated each differential display for each
primer combination. Only bands showing RNA overabundance in at least 2 experiments were
selected for further analysis.
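
The reproducibility rule stated above (a band is kept only if it shows overabundance in at least 2 experiments) can be sketched in Python as follows. The band labels and the function name are hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter


def reproducible_bands(experiments, min_hits=2):
    """Keep band labels scored as overabundant in at least `min_hits` displays."""
    counts = Counter(band for exp in experiments for band in set(exp))
    return sorted(band for band, n in counts.items() if n >= min_hits)


# Hypothetical band labels (primer pair plus gel position), one set per repeat
run1 = {"T11AC/OPA03:b7", "T11AC/OPA03:b12", "T11CC/OPA11:b4"}
run2 = {"T11AC/OPA03:b7", "T11CC/OPA11:b4", "T11CC/OPA11:b9"}
print(reproducible_bands([run1, run2]))
# ['T11AC/OPA03:b7', 'T11CC/OPA11:b4']
```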

It is preferable to include in the differential display experiment RNA derived from uncultured 
normal mammary epithelial cells (termed "organoids"). These cells are obtained from surgical 
samples resected from healthy breast tissue, which are then coaxed apart by blunt dissection
techniques and mild enzyme treatment. Using organoids as the negative control, 33 cDNA
fragments were isolated from 15 displays.

Example 2: Sub-selecting cDNA that corresponds to genes that are duplicated in breast
cancer cells

cDNA fragments that were differentially expressed in the fashion described in Example 1 
were excised from the dried gel and extracted by boiling at 95°C for 10 min. Eluted cDNA was
recovered by ethanol precipitation, and replicated by PCR. The product was cloned into the pCRII 
vector using the TA cloning system (Invitrogen). 

EcoRI digested placenta DNA, and EcoRI digested DNA from the breast cancer cell lines
BT474, SKBR3 and ZR-75-30 were used to prepare Southern blots to screen the cloned cDNA
fragments. The cloned cDNA fragments were labeled with [32P]-dCTP, and used individually to probe
the blots. A larger relative amount of binding of the probe to the lanes corresponding to the cancer
cell DNA indicated that the corresponding gene had been duplicated in the cancer cells. The labeled
cDNA probes were also used in Northern blots to verify that the corresponding RNA was
overabundant in the appropriate cell lines.

To determine whether the cDNA fragments obtained by this selection procedure 
corresponded to novel genes, a partial nucleotide sequence was obtained using M13 primers. 
Each sequence was compared with the known sequences in GenBank. In initial experiments, 5 of 
the first 7 genes sequenced were mitochondrial genes. To avoid repeated isolation of
mitochondrial genes, subsequent screening experiments were done with additional lanes in the
DNA blot analysis for EcoRI digested and HindIII digested mitochondrial DNA. Any cDNA fragment
that hybridized to the appropriate mitochondrial restriction fragments was suspected of
corresponding to a mitochondrial gene, and not analyzed further.

From the 33 cDNA fragments detected from differential displays using organoid mRNA, 12 
were subcloned. Of these 12, 6 detected suitable gene duplications in the appropriate cell lines. 
Three cDNA failed to detect duplicated genes, and 3 appeared to correspond to mitochondrial
genes. Sequence analysis of the 6 suitable cDNA fragments showed no identity to any known
genes.

To obtain longer cDNA corresponding to the cDNA fragments with novel sequences, the 
fragments were used as probes to screen a cDNA library from breast cancer cell line BT474, 
constructed in lambda GT10. The longer cDNA obtained from lambda GT10 were sequenced 
using lambda GT10 primers. The chromosomal locations of the cDNAs were determined using 
panels of somatic cell hybrids.

Four of the 6 novel cDNA identified so far have been processed in this fashion. The 
probes used to obtain the 4 new breast cancer genes are shown in Table 3. 



TABLE 3: Primers used for Differential Display

cDNA             Oligo-dT primer            Arbitrary primer
CH1-9a11-2       T₁₁CC (SEQ ID NO:9)        SEQ ID NO:11
CH8-2a13-1       T₁₁AC (SEQ ID NO:10)       SEQ ID NO:12
CH13-2a12-1      T₁₁AC (SEQ ID NO:10)       SEQ ID NO:13
CH14-2a16-1      T₁₁AC (SEQ ID NO:10)       SEQ ID NO:14




Example 3: Using the cDNA to test panels of breast cancer cells 

To determine the proportion of breast cancers in which the putative breast cancer genes 
were duplicated, or showed RNA overabundance without gene duplication, the four cDNA obtained 
according to the selection procedures described were used to probe a panel of breast cancer cell
lines and primary tumors. 

Gene duplication was detected either by Southern analysis or slot-blot analysis. For 
Southern analysis, 10 ng of EcoRI digested genomic DNA from different cell lines was
electrophoresed on 0.8% agarose and transferred to a HYBOND™ N+ membrane (Amersham). The
filters were hybridized with 32P-labeled cDNA for the putative breast cancer gene. After an
autoradiogram was obtained, the probe was stripped and the blot was re-probed using a reference
probe to adjust for differences in sample loading. Either chromosome 2 probe D2S5 or chromosome
21 probe D21S6 was used as a reference. Densities of the signals on the autoradiograms were
obtained using a densitometer (Molecular Dynamics). The density ratio between the breast cancer
gene and the reference gene was calculated for each sample. Two samples of placental DNA digests
were run in each Southern analysis as a control.

For slot-blot analysis, 1 ng of genomic DNA was denatured and slotted on the HYBOND™ 
membrane. D21S5 or human repetitive sequences were used as reference probes for slot blots. The
density ratio between the breast cancer gene and the reference gene was calculated for each sample.
10-15 samples of placental DNA digests were used as control. Amongst the control samples, the
highest density ratio was set at 1.0. The density ratios of the tumor cell lines were standardized
accordingly. An arbitrary cut-off for the standardized ratio (typically 1.3) was defined to identify
samples in which the putative gene had been duplicated. Each of the cell lines in the breast cancer
panel was scored positively or negatively for duplication of the gene being tested. 
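
The standardization and scoring just described can be sketched in Python as follows. The densitometer readings, sample names and function name are hypothetical; the sketch also assumes that a standardized ratio equal to the cut-off is scored as positive, which the text does not specify.

```python
def standardized_ratios(samples, controls, cutoff=1.3):
    """Standardize gene/reference density ratios against the highest control.

    `samples` and `controls` map a name to (gene_density, reference_density).
    Returns name -> (standardized_ratio, "+" or "-").
    """
    control_ratios = [gene / ref for gene, ref in controls.values()]
    top_control = max(control_ratios)  # highest control ratio is set at 1.0
    scored = {}
    for name, (gene, ref) in samples.items():
        ratio = (gene / ref) / top_control
        scored[name] = (round(ratio, 2), "+" if ratio >= cutoff else "-")
    return scored


# Hypothetical densitometer readings, for illustration only
controls = {"placenta_1": (100.0, 100.0), "placenta_2": (90.0, 100.0)}
samples = {"line_A": (270.0, 100.0), "line_B": (105.0, 100.0)}
print(standardized_ratios(samples, controls))
# {'line_A': (2.7, '+'), 'line_B': (1.05, '-')}
```

Standardizing against the highest control, rather than the mean, makes the score conservative: a sample is called positive only if it exceeds every control by the chosen margin.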

Some of the cell lines in the panel were known to have duplicated chromosomal regions from 
comparative genomic hybridization analysis. In instances where the cDNA being used as probe 
mapped to the known amplified region, the cDNA indicated that the corresponding gene had also 
been duplicated. However, duplicated genes were also detected using each of the four cDNAs in 
instances where comparative genomic hybridization had not revealed any amplification.

Because of the nature of the technique, the standardized ratio calculated as described 
underestimates the gene copy number, although it is expected to rank in the same order. For 
example, the standardized ratio obtained for the c-myc gene in the SKBR3 breast cancer cell was 5.0.
However, it is known that SKBR3 has approximately 50 copies of the c-myc gene.

To test for overabundance of RNA, 10 µg of total RNA from breast cancer cell lines or primary
breast cancer tumors were electrophoresed on 0.8% agarose in the presence of the denaturant
formamide, and then transferred to a nylon membrane. The membrane was probed first with
32P-labeled cDNA corresponding to the putative breast cancer gene, then stripped and reprobed with
32P-labeled cDNA for the beta-actin gene to adjust for differences in sample loading. Ratios of
densities between the candidate gene and the beta-actin gene were calculated. RNA from three
different cultured normal epithelial cells were included in the analysis as a control for the normal level
of gene expression. The highest ratio obtained from the normal cell samples was set at 1.0, and the
ratios in the various tumor cells were standardized accordingly.

Example 4: Chromosome 1 gene CH1-9a11-2

One of the cDNA obtained through the selection procedures of Examples 1 and 2
corresponded to a gene that mapped to Chromosome 1.

Table 4 summarizes the results of the analysis for gene duplication and RNA overabundance. 
Both quantitative and qualitative assessment is shown. The numbers shown were obtained by 
comparing the autoradiograph intensity of the hybridizing band in each sample with that of the 
controls. Several control samples were used for the gene duplication experiments, consisting of 
different preparations of placental DNA. The control sample with the highest level of intensity was
used for standardizing the other values. Other sources used for this analysis were breast cancer cell
lines with the designations shown. For reasons stated in Example 3, the quantitative number is not a
direct indication of the gene copy number, although it is expected to rank in the same order. Similarly,
up to 6 control samples were used for the RNA overabundance experiments, consisting of different
preparations of breast cell organoids which had been maintained briefly in tissue culture until the
experiment was performed. The control sample with the highest level of intensity was used for 
standardizing the other values. Each cell line was scored + or - according to an arbitrary cut-off value. 

TABLE 4: Chromosome 1 Gene in Breast Cancer Cell Lines

                 CH1-9a11-2              CH1-9a11-2 RNA Overabundance
Source           Gene Duplication        5.2 kb            4.4 kb

Normal           -     1.00*             -     1.00**      -     1.0**
BT474            +     2.70              +     1.57        +     3.7
ZR-75-30               2.65                    nd                nd
MDA453           +     2.86              +     5.79        +     6.2
MDA435                 3.72              -     0.89        +     2.4
SKBR3            +     1.86              -     0.94        +     2.9
600PE                  1.72                    4.47        +     6.8
MDA157           +     1.49              -     1.08        +     1.4
MCF7             +     1.95                    nd                nd
DU4475           +     2.02                    1.13        +     1.5
MDA231                 1.23              +     1.47
BT20                   1.09                    0.83        +     1.9
T47D                   1.05                    nd                nd
UACC812                0.67              +     1.57        +     1.8
MDA134                 1.19                    5.04              7.1
CAMA-1                 1.02                    2.51              7.2

Incidence (%)    9/15 (60%)              7/12 (58%)        11/12 (92%)

+ = gene duplication or RNA overabundance; - = no duplication or overabundance; nd = not done
* Degree of gene duplication is reported relative to placental DNA preparations.
** Degree of RNA overabundance is reported relative to the highest level observed for
several cultures of normal epithelial cells. Two hybridizing species of RNA
are calculated and reported separately.
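
As a simple arithmetic check, the percentages in the Incidence row can be recomputed from the counts of positive and tested cell lines. The function name in the following Python sketch is hypothetical; the counts are those given in the Incidence row.

```python
def incidence(positive, tested):
    """Format an incidence as 'positive/tested (percent%)', rounded to a whole percent."""
    return f"{positive}/{tested} ({round(100 * positive / tested)}%)"


print(incidence(9, 15))   # 9/15 (60%)
print(incidence(7, 12))   # 7/12 (58%)
print(incidence(11, 12))  # 11/12 (92%)
```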



The gene corresponding to the CH1-9a11-2 cDNA was duplicated in 9 out of 15 (60%) of the
breast cancer cell lines tested, compared with placental DNA digests (P3 and P12). The sequence of
the 115 bases from the 5' end of the cDNA fragment (SEQ. ID NO:1) is shown in Figure 22. There
was no substantial homology to any known gene in GenBank. One of the three possible reading
frames was found to be open, with the predicted amino acid sequence shown in Figure 22 (SEQ. ID NO:2).
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The CH1-9a11-2 gene was further characterized by obtaining additional sequence 
^formation. A X-GT10 cONA library from the breast cancer cell line BT474 (Example 2) was 
screened using the initial cONA insert, and a clone with a 2.5 kilobase insert was identified The 
identified done was subcloned into p.asmid vector pCRII. T7 and Sp6 primers for regions flanking the 
5 cDNA inserts were used as initial sequencing primere: 

T7 primer (SEQ. ID NO:42)

5'-TAATACGACTCACTATAGGGAGA-3'

Sp6 primer (SEQ. ID NO:43)

5'-CATACGATTTAGGTGACACTATAG-3'

Sequencing continued by walking along the region of interest by standard techniques using 
sequencing primers based on data already obtained. Primers used in sequencing are designated
1-16 in Figure 7.

A second clone (designated pCH1-1.1) overlapping on the 5' end was obtained using the
CLONTECH Marathon™ cDNA Amplification Kit. A map showing the overlapping regions is provided
in Figure 6. Briefly, two DNA primers designated CH1a and CH1b (Figure 7) were synthesized.
Polyadenylated RNA from breast cancer cell line 600PE was reverse transcribed using the CH1b primer.
After second strand synthesis, adaptor DNA provided in the kit was ligated to the double-stranded
cDNA. The 5' end cDNA of CH1-9a11-2 was then amplified by PCR using primers CH1a and AP1
(provided in the kit). To increase the specificity of the PCR products, the first PCR products were
reamplified by PCR using nested primers CH1a and AP2 (provided in the kit). The PCR products were
cloned into pCRII vector (Invitrogen) and screened with a CH1-9a11-2 probe.

The sequence of 3452 base pairs between the 5' end of pCH1-1.1 and the poly-A tail of
CH1-9a11-2 was determined by standard sequencing techniques. The DNA sequence is shown in Figure
8 (SEQ. ID NO:15). The longest open reading frame is in frame 1 (bases 1-1875), and codes for 624
amino acids before the stop codon. The corresponding amino acid sequence of this frame is shown
in the upper panel of Figure 9 (SEQ. ID NO:16). The partial sequence predicted for the translated
protein is listed in the lower panel of Figure 9 (SEQ. ID NO:17). Bases 1876 to the end of the sequence
are believed to be a 3' untranslated region. A hydrophobicity analysis identified a putative membrane
insertion or membrane spanning region at about amino acids 382-400, indicated in Figure 9 by
underlining.
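
A hydrophobicity analysis of the kind mentioned above is commonly run as a sliding-window average over a per-residue hydropathy scale. The sketch below is an assumption-laden illustration, not the method used here: the text does not name a scale, so the Kyte-Doolittle values are assumed; the 19-residue window and the 1.6 threshold are conventional choices rather than values from this document; and the toy sequence is invented. Residues 382-400 span 19 positions, which is the size of one such window.

```python
# Kyte-Doolittle hydropathy values (assumed scale; the text does not name one).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein, window=19):
    """Mean hydropathy of every full-length window along the sequence."""
    values = [KD[residue] for residue in protein]
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]


def candidate_membrane_windows(protein, window=19, threshold=1.6):
    """1-based start positions of windows whose mean hydropathy meets the threshold."""
    profile = hydropathy_profile(protein, window)
    return [start + 1 for start, mean in enumerate(profile) if mean >= threshold]


# Invented toy sequence: a 19-residue hydrophobic stretch flanked by charged residues.
toy = "DEKRDEKRDE" + "LIVALIVALIVALIVALIV" + "DEKRDEKRDE"
hits = candidate_membrane_windows(toy)
print(hits)
```

Windows that overlap the hydrophobic stretch only partially may also pass the threshold, so in practice adjacent hits are merged into one candidate region.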

Figure 23 is a listing of additional cDNA sequence obtained for CH1-9a11-2, comprising
approximately 1934 base pairs 5' from the sequence of Figure 8. The additional sequence data was
obtained by rescuing and amplifying two further fragments of CH1-9a11-2 cDNA. Nested primers
were designed ~100 base pairs downstream from the 5' end of the known sequence. The primers
were used in a nested amplification assay with AP1 and AP2, using the CLONTECH Marathon™
cDNA Amplification Kit as described above. The template for the first upstream fragment was
reverse-transcribed polyadenylated RNA from breast cancer cell line 600PE, as described earlier.
This fragment was sequenced, and another set of nested primers was designed. The template for the
next upstream fragment was a Marathon™ ready cDNA preparation from human testes, also supplied
by CLONTECH.

The nucleotide sequence shown in Figure 23 comprises an open reading frame through to
the 5' end. Figure 24 shows the corresponding protein translation. About another 500-1000
bases are predicted to be present in CH1-9a11-2 in the 5' direction, with the protein encoding sequence
beginning somewhere within this additional sequence. Sequencing of the encoding region is
completed by obtaining additional CH1-9a11-2 fragments in this direction.

A GENINFO® BLAST search of nucleotide and peptide sequence databases was performed
through the National Center for Biotechnology Information on February 23, 1996. Short segments of
homology with other reported human sequences were found at the nucleotide level (<500 base pairs),
but none with any ascribed function in the respective identifier. At the amino acid level, no identity
higher than 30% was found with any reported eukaryotic sequences.

A CH1-9a11-2 cloned insert has been used to probe the level of relative expression in
polyadenylated RNA from a panel of tissue sources. The RNA was obtained already prepared for
Northern blot analysis (CLONTECH Catalog # 7759-1, 7760-1 and 7756-1). The manufacturer
produced the blots from approximately 2 μg of poly-A RNA per lane, run on a denaturing
formaldehyde 1.2% agarose gel, transferred to a nylon membrane, and fixed by UV irradiation. The
relative CH1-9a11-2 expression observed at the RNA level is shown in Table 5:

TABLE 5: Northern blot analysis

Tissue              CH1-9a11-2 mRNA

heart               ++
brain               +
placenta            ++
lung                +/-
liver
skeletal muscle     +/-
kidney
pancreas            +++
spleen              +
thymus              +
prostate            ++
testis              +++
ovary               ++
small intestine     +
colon               +/-
peripheral blood    +/-

++++ Very high
+++  High
++   Medium
+    Low
+/-  Very low


Relatively elevated levels of expression were observed in heart, placenta, pancreas, prostate, testis
and ovary. The level of expression in breast cancer cell lines is also relatively high (about ++++ on
the scale), since the Northern analysis performed on these lines (described above) was conducted on
total cellular RNA, of which polyadenylated RNA constitutes only about 5%. It is likely that the
CH1-9a11-2 gene is involved in a biological process that is typical to the tissue types showing medium to
high levels of expression, which may relate to increased tissue growth or metabolism.

Since the obtained sequence is shorter than the apparent size of mRNA observed in
Northern analysis (Table 1), an additional polynucleotide segment is believed to be present at the 5'
end of the sequence shown in SEQ. ID NO:15. Further sequence data at the 5' end is deduced by
obtaining additional cloned cDNA using standard techniques. Briefly, in one approach, mRNA from
breast cancer cell lines MDA453 and/or 600PE are cloned and screened using primers based on
sequence data from SEQ. ID NO:15. Two nested primers of about 20 nucleotides are prepared, the
innermost about 150 base pairs from the 5' end, and the outermost about 170 base pairs from the 5'
end. The outermost primer is used to synthesize a first cDNA strand complementary to the mRNA in
the upstream direction. Second strand synthesis is performed using reagents in a CLONTECH
Marathon™ cDNA amplification kit according to manufacturer's directions. The double-stranded DNA
is then ligated at the 5' end of the coding sequence with the double-stranded adaptor fragment
provided in the kit. A first PCR amplification (about 30 cycles) is performed using the first adapter
primer from the kit and the outermost RNA-specific primer, and a second amplification (about 30
cycles) is performed using the second adapter primer and the innermost RNA-specific primer. In an
alternative approach, a CLONTECH RACE-READY single-stranded cDNA from human placenta is
PCR amplified using nested 5' anchor primers in combination with the outermost and innermost
RNA-specific primers. Amplified DNA obtained using either approach is analyzed by gel electrophoresis
and cloned into plasmid vector pCRII. Clones are screened, as necessary, using the 2.5 kilobase
CH1-9a11-2 insert. Clones corresponding to full-length mRNA (4.5 kb or 5.5 kb; Table 1) or cDNA
fragments overlapping at the 5' end are selected for sequencing. Compared with the 4.5 kb form,
additional polynucleotide segments may be present in the 5.5 kb form within the encoding region, or in
the 5' or 3' untranslated region.

Example 5: Chromosome 8 gene CH8-2a13-1

One of the cDNA obtained corresponded to a gene that mapped to Chromosome 8. Figure 2
shows the Southern blot analysis for the corresponding gene in various DNA digests. Lane 1 (P12) is
the control preparation of placental DNA; the rest show DNA obtained from human breast cancer cell
lines. Panel A shows the pattern obtained using the 32P-labeled CH8-2a13-1 cDNA probe. Panel B
shows the pattern obtained with the same blot using the 32P-labeled D2S6 probe as a loading control.
The sizes of the restriction fragments are indicated on the right.

Figure 3 shows the Northern blot analysis for RNA overabundance. Lanes 1-3 show the level 
of expression in cultured normal epithelial cells. Lanes 4-19 show the level of expression in human 
breast cancer cell lines. Panel A shows the pattern obtained using the CH8-2a13-1 probe; panel B
shows the pattern obtained with beta-actin cDNA, a loading control.

The results are summarized in Table 6. The scoring method is the same as for Example 4.

TABLE 6: Chromosome 8 Genes in Breast Cancer Cell Lines

Source

CH8-2a13-1
Gene Duplication

CH8-2a13-1
RNA Overabundance

c-myc
Gene Duplication

Normal 

SKBR3 

ZR-75-30 

BT474 

MDA157

MCF7

CAMA-1

MDA361

MDA468

T47D 

MDA453 

MDA134 

MDA435 

600PE 

UACC812 

MDA231 
DU4475 
BT468 
BT20 



+ 
+ 

nd 

+ 
+ 
+ 
+ 



1.00* 
4.25 
3.82 
1.53 
2.02 
1.84 
3.62 
2.00 



1.41 
1.83 
1.30 
2.15 
0.95 
1.25 

0.80 
0.85 
0.37 
0.95 



+ 

nd 
+ 
+ 
+ 



1.00** 
4.30 

1.72 
3.39 
4.92 
2.14 
1.74 
4.50 

1.58 
3.10 
3.70 
4.94 
2.04
2.40 

1.28 
0.88 
0.70 
0.82 



+ 
+ 

+ 
+ 

nd 
nd 



1.00* 
4.73 
2.24 
1.76 
1.39 
3.10 
1.61 



1.02 
0.90 
0.88 
1.00 
0.54 
0.74 

1.27 
0.50 
0.23 



Incidence

12/17
(71%)

14/17
(82%)

7/16
(44%)

+ = gene duplication or RNA overabundance; - = no duplication or overabundance; nd = not done.
* Degree of gene duplication is reported relative to placental DNA preparations.
** Degree of RNA overabundance is reported relative to the highest level observed for several cultures of
normal epithelial cells.



The gene corresponding to CH8-2a13-1 showed clear evidence of duplication in 12 out of 17 
(71%) of the cells tested. RNA overabundance was observed in 14 out of 17 (82%). Thus, 11% of 
the cells had achieved RNA overabundance by a mechanism other than gene duplication. 
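
The arithmetic behind the 11% figure can be checked with a short Python sketch. The function name is illustrative, and the last line assumes that the same panel of cell lines was scored in both assays, which the text does not state outright.

```python
from fractions import Fraction


def incidence_percent(positive, tested):
    """Incidence as a whole-number percentage."""
    return round(100 * Fraction(positive, tested))


duplication = incidence_percent(12, 17)    # gene duplication
overabundance = incidence_percent(14, 17)  # RNA overabundance
difference = overabundance - duplication   # percentage-point gap quoted in the text
# Assumption: both assays cover the same 17 lines, so at least this many
# lines show overabundance without duplication.
minimum_lines = 14 - 12
print(duplication, overabundance, difference, minimum_lines)
```

The difference is a lower bound rather than an exact count: any cell line that is duplicated but not overabundant would raise the number of lines reaching overabundance by another route.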

Since the known oncogene c-myc is located on Chromosome 8, the Southern analysis was 
also conducted using a probe for c-myc. At least 2 of the breast cancer cells showing duplication of
the gene corresponding to CH8-2a13-1 did not show duplication of c-myc. This indicates that
the gene corresponding to CH8-2a13-1 is not part of the myc amplicon.

The sequence of 150 bases from the 5' end of the cDNA fragment is shown in Figure 22
(SEQ ID NO:3). There was no substantial homology to any known gene in GenBank. One of the
three possible reading frames was found to be open, with the amino acid sequence shown in Figure
22 (SEQ ID NO:4).

The CH8-2a13-1 gene was further characterized by obtaining additional sequence
information. A λ-GT10 cDNA library from the breast cancer cell line BT474 (Example 2) was
screened using the initial cDNA insert, and clones with a 3.0 kb and a 4.0 kb insert were identified.
The two identified clones were subcloned into plasmid vector pCRII. T7 and Sp6 primers for regions
flanking the cDNA inserts were used as initial sequencing primers. Sequencing continued by walking
along the region of interest by standard techniques, using sequencing primers based on data already
obtained. The two inserts were found to overlap (Figure 6). Primers used are those designated 1-25
in Figure 10.

A third clone of about 600 bp (designated pCH8-600) overlapping on the 5' end (Figure 6)
was obtained using the CLONTECH Marathon™ cDNA Amplification Kit. Briefly, two DNA primers CH8a
and CH8b (Figure 10) were synthesized. Polyadenylated RNA from breast cancer cell line BT474
was reverse transcribed using the CH8b primer. After second strand synthesis, adaptor DNA provided in
the kit was ligated to the double-stranded cDNA. The 5' end cDNA of CH8-2a13-1 was then amplified
by PCR using primers CH8a and AP1 (provided in the kit). To increase the specificity of the PCR
products, the first PCR products were reamplified by PCR using nested primers CH8a and AP2
(provided in the kit). The PCR products were cloned into pCRII vector (Invitrogen) and screened with
a CH8-2a13-1 probe.

By sequencing relevant portions of the three clones, a nucleic acid sequence of 3982 base
pairs between the 5' end and the poly-A tail of CH8-2a13-1 was determined. The DNA sequence is
shown in Figure 11 (SEQ. ID NO:18). Bases 1-152 are believed to be a 5' untranslated region. The
longest open reading frame is in frame 3 from base 153 to 3911, and codes for 1252 amino acids
before the stop codon. The corresponding amino acid sequence of this frame is shown in the upper
panel of Figure 12 (SEQ ID NO:19). The sequence predicted for the translated protein is shown in
the lower panel of Figure 12 (SEQ. ID NO:20).
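
The reading frame and residue count quoted above follow from the base coordinates, as the Python sketch below illustrates. The function name is hypothetical, and the calculation assumes 1-based inclusive coordinates in which the last base quoted is the final base of the stop codon; that convention is not stated in the text but reproduces its figures.

```python
def orf_summary(first_base, last_base):
    """Reading frame (1-3) and residue count for an open reading frame.

    Assumes 1-based, inclusive coordinates and that last_base is the final
    base of the stop codon.
    """
    length = last_base - first_base + 1
    if length % 3 != 0:
        raise ValueError("ORF length is not a whole number of codons")
    frame = (first_base - 1) % 3 + 1
    amino_acids = length // 3 - 1  # exclude the stop codon
    return frame, amino_acids


# CH8-2a13-1 coordinates quoted above: base 153 to base 3911.
print(orf_summary(153, 3911))
```

Under the same assumption, the coordinates given for CH1-9a11-2 (bases 1-1875) and CH14-2a16-1 (bases 1-792) yield frame 1 with 624 and 263 residues respectively, matching the counts stated for those genes.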

A GENINFO® BLAST search of nucleotide and peptide sequence databases was performed
through the National Center for Biotechnology Information on March 26, 1996. The sequences were
found to be about 99% identical at the nucleotide and amino acid level with bases 343-4103 of
KIAA0196 protein (N. Nomura et al., in press; sequence submitted to the DDBJ/EMBL/GenBank
databases on March 4, 1996). KIAA0196 was one of 200 different cDNA cloned at random from
an immature male human myeloblast cell line. KIAA0196 has no known biological function, and is
described by Nomura et al. as being ubiquitously expressed.

A fourth clone of about 600 bp overlapping pCH8-600 at the 5' end has also been obtained.
Briefly, a DNA primer was synthesized corresponding to about the first 20 nucleotides at the 5' end of the
predicted cDNA sequence, and used along with a primer based on the pCH8-600 sequence to
reverse-transcribe RNA from breast cancer cell line BT474. The product was cloned into pCRII vector
(Invitrogen) and screened with a CH8-2a13-1 probe. The new clone is sequenced along both strands
to obtain additional 5' untranslated sequence data for the cDNA. The predicted compiled cDNA
nucleotide sequence of CH8-2a13-1 cDNA is shown in Figure 13 (SEQ. ID NO:21). The
corresponding amino acid sequence of this frame is shown in Figure 14 (SEQ. ID NO:22). A
polynucleotide comprising the compiled sequence is assembled by joining the insert of this fourth
clone to pCH8-4k within the shared region. Briefly, CH8-4k is cut with XbaI and NotI. The fourth
clone is cut with BamHI and XbaI. The ligated polynucleotide is then inserted into pCRII cut with
BamHI and NotI.

A CH8-2a13-1 cloned insert has been used to probe the level of relative expression in 
polyadenylated RNA from a panel of tissue sources obtained from CLONTECH, as in Example 4. 
The relative CH8-2a13-1 expression observed at the mRNA level is shown in Table 7:



TABLE 7: Northern blot analysis

Tissue              CH8-2a13-1 mRNA

heart
brain
placenta            +
lung                +
liver               +/-
skeletal muscle     +/-
kidney              +/-
pancreas            +/-
spleen              +
thymus              +
prostate
testis
ovary               +
small intestine     +
colon               +
peripheral blood

++++ Very high
+++  High
++   Medium
+    Low
+/-  Very low



Relative levels of expression observed were as follows: Low levels of expression were observed in
adult peripheral blood leukocytes (PBL), brain, placenta, lung, liver, skeletal muscle, kidney, and
pancreas. Medium levels of expression were observed in adult heart, spleen, thymus, prostate, testis,
ovary, small intestine, and colon. High levels of expression were observed in four fetal tissues tested:
brain, lung, liver and kidney. The level of expression in breast cancer cell lines is relatively high
(about on the scale), since the Northern analysis performed on these lines was conducted on
total cellular RNA. It is likely that the CH8-2a13-1 gene is involved in a biological process that is
typical to the tissue types showing medium to high levels of expression, which may relate to increased
tissue growth or metabolism.


Example 6: Chromosome 13 gene CH13-2a12-1 



One of the cDNA obtained corresponded to a gene that mapped to Chromosome 13. Figure
4 shows the Southern blot analysis for the corresponding gene in various DNA digests. Lanes 1 and
2 are control preparations of placental DNA; the rest show DNA obtained from human breast cancer
cell lines. Panel A shows the pattern obtained using the CH13-2a12-1 cDNA probe; panel B shows
the pattern using D2S6 probe as a loading control. The sizes of the restriction fragments are
indicated on the right.

Figure 5 shows the Northern blot analysis for RNA overabundance of the CH13-2a12-1 gene.
Lanes 1-3 show the level of expression in cultured normal epithelial cells. Lanes 4-19 show the level
of expression in human breast cancer cell lines. Panel A shows the pattern obtained using the
CH13-2a12-1 probe; panel B shows the pattern obtained with beta-actin cDNA, a loading control. The
apparent size of the mRNA varied depending upon conditions of electrophoresis. Full-length mRNA is
believed to occur at sizes of about 3.2 and 3.5 kb.

The results of the RNA abundance comparison are summarized in Table 8. The scoring
method is the same as for Example 4.

TABLE 8: Chromosome 13 Gene in Breast Cancer Cell Lines

             CH13-2a12-1         CH13-2a12-1
Source       Gene duplication    RNA Overabundance

Normal            1.00*          -    1.00**
600PE        +    2.18           +    5.57
BT474        +    1.60                3.20
SKBR3             1.58                4.25
MDA157            2.21                3.76
CAMA-1            1.41           +    1.99
MDA231            1.65           +    2.09
T47D         +    1.23                1.20
MDA468            nd             +    6.90
MDA361            nd             +    2.59
MDA435            0.59           +    3.41
MDA134            0.53           +    2.59
DU4475            0.75                1.79
MDA453            0.89           +    1.97
BT20              0.37                1.04
MCF7              0.29                1.03
UACC812           0.30                0.39
BT468             0.47                nd
ZR-75-30          0.70                nd

Incidence    7/16 (44%)          13/16 (81%)

+ = gene duplication or RNA overabundance; - = no duplication or overabundance; nd = not done.
* Degree of gene duplication is reported relative to placental DNA preparations.
** Degree of RNA overabundance is reported relative to the highest level observed for several cultures
of normal epithelial cells.



The gene corresponding to CH13-2a12-1 was duplicated in 7 out of 16 (44%) of the cells
tested. Three of the positive cell lines (600PE, BT474, and MDA435) had been studied previously by
comparative genomic hybridization, but had not shown amplified chromatin in the region where
CH13-2a12-1 has been mapped in these studies.

RNA overabundance was observed in 13 out of 16 (81%) of the cell lines tested. Thus, 37%
of the cells had achieved RNA overabundance by a mechanism other than gene duplication.

Cells from primary breast tumors have also been analyzed for duplication of the
chromosome 13 gene. Ten of the 82 tumors analyzed (12%) were positive, confirming that
duplication of this gene is not an artifact of in vitro culture.

The sequence of 107 bases from the 5' end of the 1.5 kb cDNA fragment is shown in Figure
22 (SEQ ID NO:5). There was no substantial homology to any known gene in GenBank. One of the
three possible reading frames was found to be open, with the predicted amino acid sequence shown
in Figure 22 (SEQ ID NO:6).

The CH13-2a12-1 gene was further characterized by obtaining additional sequence
information. A λ-GT10 cDNA library from the breast cancer cell line BT474 (Example 2) was
screened using the initial cDNA insert, and clones with a 3.5 kilobase and a 1.6 kilobase insert were
identified. The two identified clones were subcloned into plasmid vector pCRII. T7 and Sp6 primers
for regions flanking the cDNA inserts were used as initial sequencing primers. Sequencing continued
by walking along the region of interest by standard techniques, using sequencing primers based on
data already obtained. The two inserts were found to overlap (Figure 6). Primers used during
sequencing are shown in Figure 15.

By sequencing relevant portions of the 3.5 and 1.6 kb clones, a nucleic acid sequence of
3339 base pairs between the 5' end and the poly-A tail of CH13-2a12-1 was determined. The DNA
sequence is shown in Figure 16 (SEQ. ID NO:23). Bases 1-520 are believed to be a 5' untranslated
region. The longest open reading frame is in frame 2 from base 521 to 1838, and codes for 611
amino acids before the stop codon. The corresponding amino acid sequence of this frame is shown
in the upper panel of Figure 17 (SEQ. ID NO:24). The sequence predicted for the translated protein is
shown in the lower panel of Figure 17 (SEQ. ID NO:25). Bases 1838 to 3339 of the nucleotide
sequence are believed to be a 3' untranslated region, which is present in the 3.5 kb insert. The 3.5 kb

25 r 2 Tr? 56 a sp,ice vanant (Fi9ure 6) ' in *** •* 3 ' re9i °" «-* ° f -~ 

to 1838-2797 in the sequence. 

A GENINFO® BLAST search of nucleotide and peptide sequence databases was performed
through the National Center for Biotechnology Information on March 26, 1996. Short segments of
homology with other reported human sequences were found at the nucleotide level (<500 base pairs),
but none had any ascribed function in the respective identifier. At the amino acid level, the sequence
was ,aa„d ,a sba,a 33% M a«,s and 54 % posbvas « 226 re s«aas a, ,ba . „ „L o, 
e^aa, im p«* baa baaa ia^ad k ta <*, cyca or c. a, W aa 

K,p,a=a W Ha , EM Hadaacac*, Tba CH,3. 2 a I2 ., aane is aaspaa.ed o, a a* ,„ JL g 
ca. pn^abaa -Cdrda*, ca, paa^- ,„ „ s ^ TOans ^ a „ J 
'e»« »f geaa aiprassiaa at the RNA a, p™** ,e,al ^ h a bigba, „ ^ raK of M , 
pra^bdn. „ viae .ad*, coa.paaad wab aa.s »* aa obvawisa .Mb, phe ao*pa. Tbara * a, S » a 

raaapta, rabbit ^ ^ , BumatorakaJ<te)i „ „ a|) ^ _ 
sequence, whereas none has been detected in CH13-2a12-1. Nevertheless, it is possible that the
CH13-2a12-1 protein product has a Ca2+ binding or Ca2+ mobilizing function.

A CH13-2a12-1 cloned insert has been used to probe the level of relative expression in
polyadenylated RNA from a panel of tissue sources obtained from CLONTECH, as in Example 4. 
The relative CH13-2a12-1 expression observed at the mRNA level is shown in Table 9: 



TABLE 9: Northern blot analysis

Tissue              CH13-2a12-1 mRNA

heart               ++++
brain               +
placenta            ++
lung                +
liver               ++
skeletal muscle     ++++
kidney              +
pancreas
spleen              ++
thymus
prostate            ++
testis              +++
ovary               ++
small intestine     ++
colon               +
peripheral blood    +

++++ Very high
+++  High
++   Medium
+    Low
+/-  Very low

Relatively elevated levels of expression were observed in heart, skeletal muscle and testis. 
The level of expression in breast cancer cell lines is relatively high (about on the scale), since 
the Northern analysis performed on these lines was conducted on total cellular RNA. It is likely that
the CH13-2a12-1 gene is involved in a biological process that is typical to the tissue types showing 
medium to high levels of expression, which may relate to increased tissue growth or metabolism. 

Fragments corresponding to the CH13-2a12-1 gene have also been used to screen cell lines 
derived from other types of cancer. Southern analysis showed that about 1 out of 4 breast cancer cell 
lines tested have gene duplication of CH13-2a12-1. Northern analysis showed that about 3 out of 6 
lines tested have overexpression of the corresponding RNA transcript. 
Example 7: Chromosome 14 gene CH14-2a16-1

The results of the analysis are summarized in Table 10. The scoring method is the same as for Example 4.

TABLE 10: Chromosome 14 Gene in Breast Cancer Cell Lines

Source

Incidence

CH14-2a16-1
Gene duplication



Normal 




1.00* 


BT474
MCF7


+
+


2.89
1.35
2.58
2.28
1.52
2.23


SKBR3

T47D

MDA157


+ 


UACC812 


+ 


MDA361 




0.97 
1.58 


MDA453 


+ 


BT20 




600PE
MDA231


+ 


0.94 
1.66 
0.92 
0.87 
0.46 
0.77 


CAMA-1 




DU4475 




BT468




MDA134






CH14-2a16-1
RNA Overabundance



8/15
(53%)





1.00 


+ 


2.57 


+ 


1.88 


+ 


2.19 


nd 




+ 


2.52 


nd 




+ 


1.43 


+ 


5.92 




1.07 


+ 


2.00 




2.19 




0.71 


+ 


1.33 


nd 






7.17 




10/12 
(83%) 



misuse- ' ^*-^r- — -« — ^ 



The gene corresponding to CH14-2a16-1 was duplicated in 8 out of 15 (53%) of the cells
tested. The sequence of bases from the 5' end of the cDNA fragment is shown in Figure 22
(SEQ ID NO:7). There was no substantial homology to any known gene in GenBank. One of the
three possible reading frames was found to be open, with the predicted amino acid sequence shown
in Figure 22 (SEQ ID NO:8).

The CH14-2a16-1 gene was further characterized by obtaining additional sequence
information. A λ-GT10 cDNA library from the breast cancer cell line BT474 (Example 2) was
screened using the initial cDNA insert, and two clones were identified: one with a 1.6 kb insert, and
the other with a 2.5 kb insert. The identified clones were subcloned into plasmid vector pCRII. The
1.6 kb insert was sequenced by using T7 and Sp6 primers for regions flanking the cDNA inserts as
initial sequencing primers. Sequencing continued by walking along the region of interest by standard
techniques, using sequencing primers based on data already obtained. Primers used are those
designated 1-11 in Figure 18.

A third clone (designated pCH14-800) overlapping on the 5' end (Figure 6) was obtained
using the CLONTECH Marathon™ cDNA Amplification Kit. Briefly, DNA primers CH14a, CH14b, CH14c
and CH14d (Figure 18) were prepared. Polyadenylated RNA from breast cancer cell line MDA453
was reverse transcribed using the CH14b primer. After second strand synthesis, adaptor DNA provided in
the kit was ligated to the double-stranded cDNA. The 5' end cDNA of CH14-2a16-1 was then
amplified by PCR using primers CH14b (or CH14c) and AP1 (provided in the kit). To increase the
specificity of the PCR products, the first PCR products were reamplified by PCR using nested primers
CH14a (or CH14d) and AP2 (provided in the kit). The PCR products were cloned into pCRII vector
(Invitrogen) and screened with a CH14-2a16-1 probe.

By sequencing pCH14-1.6 and pCH14-800, a nucleic acid sequence of 2021 base pairs
between the 5' end and the poly-A tail of CH14-2a16-1 has been determined. The DNA sequence is
shown in Figure 19 (SEQ. ID NO:26). The longest open reading frame is in frame 1 from base 1 to
792, and codes for 263 amino acids before the stop codon. The corresponding amino acid sequence
of this frame is shown in the upper panel of Figure 20 (SEQ. ID NO:27). The partial sequence
predicted for the translated protein is shown in the lower panel of Figure 20 (SEQ. ID NO:28). The 2.1
kb clone has not been sequenced, but is believed to consist of about the same region of the
CH14-2a16-1 cDNA as pCH14-1.6 and pCH14-800 combined.

A GENINFO® BLAST search of nucleotide and peptide sequence databases was performed 
through the National Center for Biotechnology Information on March 26, 1996. Short segments of 
homology with other reported human sequences were found at the nucleotide level (<500 base pairs), 
but none with any ascribed function in the respective identifier. At the amino acid level, the sequence 
was found to share homologies within the first 106 residues with an RNA binding protein from 
Saccharomyces cerevisiae with the designation NAB2. NAB2 is one of the major proteins associated 
with nuclear polyadenylated RNA in yeast cells, as detected by UV light-induced cross-linking and 
immunofluorescence. NAB2 is strongly and specifically associated with nuclear poly(A)+ RNA in vivo.
Gene knock-out experiments have shown that this protein is essential to yeast cell survival
(Anderson et al.). Accordingly, the protein encoded by CH14-2a16-1 is suspected of having DNA or
RNA binding activity. 

A fourth clone (pCH14-1.3) has been obtained that overlaps the pCH14-800 clone at the 5'
end (Figure 6). The method of isolation was similar to that for pCH14-800, using primers based on
the pCH14-800 sequence. Partial sequence data for pCH14-1.3 has been obtained by
one-directional sequencing from the 5' and 3' ends of the pCH14-1.3 clone. Figure 21 shows the
nucleotide sequence of the 5' end (SEQ. ID NO:29) and the amino acid translation of
the likely open reading frame (SEQ. ID NO:30); the nucleotide sequence of the 3' end (SEQ. ID
NO:31) and the likely open reading frame (SEQ. ID NO:32). This data is confirmed and additional
sequence between SEQ. ID NOS:29 and 31 is obtained by fully sequencing both strands of
pCH14-1.3. Once compiled, the sequence data from pCH14-1.3, pCH14-800 and pCH14-1.6 may be shorter
than the apparent size of mRNA observed in Northern analysis (Table 1). If necessary, further
sequence data at the 5' end is deduced by obtaining additional cloned cDNA according to approaches
described in this Example or Example 4.

Figure 25 is a listing of additional cDNA sequence obtained for CH14-2a16-1, comprising
approximately 1934 base pairs 5' from the sequence of Figure 19. The corresponding amino acid
translation is shown in the upper panel of Figure 26. The additional sequence data was obtained by
rescuing and amplifying further fragments of CH14-2a16-1 cDNA. Nested primers were designed
~100 base pairs downstream from the 5' end of the known sequence. The primers were used in a
nested amplification assay with AP1 and AP2, using the CLONTECH Marathon™ cDNA
Amplification Kit as described above. The template was a Marathon™ ready cDNA preparation from
human testes, also supplied by CLONTECH.

The nucleotide sequence shown in Figure 25 is closed at the 5' end. The lower panel of
Figure 26 shows what is predicted to be the sequence of the gene product, beginning at the first 
methionine residue. The nucleotide sequence shown contains a point difference at the position 
indicated by the underlining in Figure 25. A base determined to be A from the previously obtained 
polynucleotide fragment was a G in the one used in this part of the experiment. This corresponds to a 
change from E (glutamic acid) to G (glycine) in the protein sequence, at the position underlined in 
Figure 26. This may represent a natural allelic variation. 
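
The reported base difference can be checked against the codon table with a short Python sketch. It assumes the standard genetic code; the function name is illustrative, and nothing here identifies which codon is actually present in the sequence.

```python
# Standard genetic code entries for the two residues involved.
GLU_CODONS = {"GAA", "GAG"}                # glutamic acid (E)
GLY_CODONS = {"GGA", "GGC", "GGG", "GGT"}  # glycine (G)


def a_to_g_glu_to_gly():
    """List single A-to-G substitutions that turn a Glu codon into a Gly codon."""
    hits = []
    for codon in sorted(GLU_CODONS):
        for pos, base in enumerate(codon):
            if base == "A":
                mutant = codon[:pos] + "G" + codon[pos + 1:]
                if mutant in GLY_CODONS:
                    hits.append((codon, pos + 1, mutant))  # 1-based codon position
    return hits


print(a_to_g_glu_to_gly())
```

Both glutamic acid codons become glycine codons only when the A in the second codon position changes to G, so a single base difference of the kind described is sufficient to account for the E to G change.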

A CH14-2a16-1 cloned insert has been used to probe the level of relative expression in 
polyadenylated RNA from a panel of tissue sources obtained from CLONTECH, as in Example 4. 
The relative CH14-2a16-1 expression observed at the mRNA level is shown in Table 11. 

TABLE 11: Northern blot analysis

Tissue              CH14-2a16-1 mRNA

heart               +
brain               +
placenta
lung                +
liver
skeletal muscle     +
kidney              +/-
pancreas
spleen
thymus
prostate
testis              ++++
ovary
small intestine
colon
peripheral blood

++++ Very high
+++  High
++   Medium
+    Low
+/-  Very low

CH14-2a16-1 mRNA was particularly high in testis. The level of expression in breast cancer 
cell lines is also quite high, since the Northern analysis performed on these lines was conducted on 
total cellular RNA. It is likely that the CH14-2a16-1 gene is involved in a biological process that is 
[illegible] to the tissue types showing medium to high levels of expression, which may relate to increased 
tissue growth or metabolism. 

Five motifs corresponding to a zinc finger protein have been found in the CH14-2a16-1 
nucleotide sequence. Further zinc finger motifs may be present in CH14-2a16-1 in the upstream 
direction. Zinc finger motifs are present, for example, in RNA polymerases I, II, and III from S. 
cerevisiae, and are related to the zinc knuckle family of RNA/ssDNA-binding proteins found in the HIV 
nucleocapsid protein. The actual sequence observed in each of the five zinc finger motifs of 
CH14-2a16-1 is: 

£v^Xaa) s -Cy^Xaa)Hi3£a-(Xaa) J -ais or (SEQ. ID NO:36) 
Cy^XaaJ^^ys^Xaa) s-Cys-fXaaJHdiS (SEQ. ID NO:39) 
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w«=h is h F*. 20 b y «^ Tte „ Wenacal „ , he ? ^ fe 

.** mate up „ RNA/ssONA H, rsgi o„ (AM<! , S o„ « „. „ ' 

KpRNA * mRNA. W e e«p M o, mRHA .» * nudeus , 0 me „ « 

•»*«. Tto role in to, may be Cose* innpfated * « graiM „ „ 

manifest in tumor cells. y as 
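
A motif of this kind (conserved Cys and His residues separated by fixed runs of arbitrary residues) can be located in a protein sequence with a regular expression. The sketch below is an assumption-laden illustration: the spacings between the conserved residues are not legible in this copy, so the Cys-X2-Cys-X4-His-X4-Cys pattern used here is a hypothetical stand-in, and the test sequence is invented.

```python
import re


def motif_regex(residues: str, gaps: list) -> "re.Pattern":
    """Build a pattern such as C.{2}C.{4}H.{4}C from anchor residues and gap lengths."""
    if len(gaps) != len(residues) - 1:
        raise ValueError("need exactly one gap between each pair of anchor residues")
    parts = [residues[0]]
    for gap, residue in zip(gaps, residues[1:]):
        parts.append(".{%d}" % gap)  # a run of `gap` arbitrary residues (Xaa)
        parts.append(residue)
    return re.compile("".join(parts))


def find_motifs(protein: str, pattern: "re.Pattern") -> list:
    """Return (start, matched_text) for every occurrence, including overlapping ones."""
    lookahead = re.compile("(?=(%s))" % pattern.pattern)
    return [(m.start(), m.group(1)) for m in lookahead.finditer(protein)]


# Hypothetical spacing; substitute the real gap lengths once they are known.
pattern = motif_regex("CCHC", [2, 4, 4])
toy_protein = "MKT" + "CAACGGGGHSSSSC" + "LLL"
hits = find_motifs(toy_protein, pattern)
print(hits)
```

The lookahead wrapper is used so that closely spaced or overlapping motifs are all reported, which matters when several fingers occur in tandem.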

Example 8: Identification of other cancer-associated genes 

cDNA fragments corresponding to additional cancer-associated genes are obtained by 
applying the techniques of Examples 1 & 2 with appropriate adaptations. As before, cancer cells 
are selected for use in differential display of RNA, based on whether they share a duplicated 
chromosomal region according to Table 12: 



TABLE 12: Cancer cell lines sharing duplicated chromosomal regions 

Chromosomal 
location        Cancer type & references 

1p22-32         small cell (Levin 1994) 
1p22            bladder (Kallioniemi 1995) 
1p32-33         [illegible], breast (Ried 1995), small cell lung (Ried 1994) 
1q21-22         sarcoma (Forus 1995a & b); breast (Muleris 1994a) 
1q24            small cell (Levin 1994) 
1q31            bladder (Kallioniemi 1995) 
1q32            glioma (Muleris 1994b; Schrock) 
1q              head and neck (Speicher 1995), breast (Muleris 1994a) 

2p23            small cell lung (Ried 1994) 
2p24-25         small cell lung (Levin 1994) 
2               head and neck (Speicher 1995) 
2q              head and neck (Speicher 1995) 
2q33-36         head and neck (Speicher 1995) 

3p22-24         bladder (Voorter), small cell (Levin 1994) 
3q24-26         bladder (Kallioniemi 1995), glioma (Kim), osteosarcoma (Tarkkanen) 
3q25-26         ovarian (Iwabuchi) 
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TABLE 12: Cancer cell lines sharing duplicated chromosomal regions 


Chromosomal 
location        Cancer type & references 

3q26-term       head and neck (Speicher 1995) 
3q              small cell lung (Levin 1995; Ried 1994); head and neck (Speicher 1995) 
4q12            glioma (Schrock) 
5p              small cell lung (Levin 1994 & 1995; Ried 1994) 
5p15.1          glioma (Muleris 1994b) 
6p              osteosarcoma (Forus 1995a); breast (Ried 1995) 
6p21-term       melanoma (Speicher) 
7p              glioma (Schlegel 1994 & 1996; may be EGFR) 
7p11-12         glioma (Muleris 1994b; Schrock), small cell lung (Ried 1994) 
7q21-32         glioma (Kim; Muleris 1994b; Schrock) 
7q21-22         head and neck (Speicher), glioma (Schrock) 
7q33-term       head and neck (Speicher 1995) 
[illegible]     colon (Schlegel 1995); glioma (Kim), head and neck (Speicher); 
                prostate (Visakorpi) 
8q              small cell lung (Ried 1994) 
8q21            bladder (Kallioniemi 1995) 
[illegible]     myeloid leukemia (Mohamed) 
8q22-24         glioma (Kim; Muleris 1994b), breast (Muleris 1994a) 
8q24-25         small cell (Levin 1994; Ried 1994), breast (Muleris 1994a) 
8q23-term       [illegible], melanoma (Speicher) 
8q24            ovarian (Iwabuchi) 
8q              breast (Ried 1995; Isola; Muleris 1994a), small cell lung (Levin 1994 & 1995), 
                B-cell leukemias (Bentz 1994a), myeloid leukemia (Bentz 1994b), glioma (Schlegel), 
                head and neck (Speicher 1995), prostate (Cher, Visakorpi) 
9               head and neck (Speicher) 
9p 
9p2             glioma (Muleris 1994b) 
9p13            breast (Muleris 1994a) 
10p             head and neck (Speicher 1995) 
10p13-14        bladder (Voorter) 
10q22           breast (Muleris 1994a) 
11q13           head and neck (Speicher 1995), breast (Muleris 1994a) 
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TABLE 12; Cancer cell lines sharing duplicated chromosomal regions 



Chromosomal 
location 



13 
13q 
13q21-34 
13q32-term 



14q 



15q26 



16 
16p 
16p11.2 



17 

17p11-12 

17q 
17q21.1 
17q22-23 
17q22-24 



Cancer type & references 

B-cell leukemias (Bentz 1995a) 

head and neck (Speicher 1995), glioma (Schrock) 

glioma (Schlegel 1994) 

bladder (Voorter), osteosarcoma (Tarkkanen), liposarcoma (Suijkerbuijk) 
liposarcoma (Suijkerbuijk) 



colon (Schlegel 1995) 

breast (Ried 1995), head and neck (Speicher 1995) 
bladder (Kallioniemi 1995) 

head and neck (Speicher 1995), small cell lung (Ried 1994) 



head and neck (Speicher 1995) 



breast (Muleris 1994a) 



head and neck (Speicher 1995) 
breast (Ried 1995) 
breast (Muleris 1994a) 



19q13.1 



20p 
20q 
20q13.3 



22q 
22q11-13 



X 
Xq 
Xq24 
Xq11-13 



head and neck (Speicher 1995) 
osteosarcoma (Forus 1995a; Tarkkanen) 
breast (Ried 1995), small cell lung (Ried 1994) 
breast (Muleris 1994a) 
bladder (Voorter), breast (Muleris 1994a) 
breast (Kallioniemi 1994) 
bladder (Voorter) 
small cell lung (Ried 1994) 



head and neck (Speicher 1995) 
ovarian (Iwabuchi), colon (Schlegel 1995), breast (Isola; Tanner) 
breast (Muleris 1994a; Kallioniemi 1994) 



head and neck (Speicher 1995) 
bladder (Voorter), glioma (Schrock) 



prostate (Visakorpi) 
small cell lung (Levin 1995) 
small cell (Levin 1994) 

prostate (Visakorpi), osteosarcoma (Tarkkanen) 



Control RNA is prepared from normal tissues to match that of the cancer cells in the 
experiment. Normal tissue is obtained from autopsy, biopsy, or surgical resection. Absence of 
neoplastic cells in the control tissue is confirmed, if necessary, by standard histological techniques. 
cDNA corresponding to RNA that is overabundant in cancer cells and duplicated in a proportion of 
the same cells is characterized further, as in Examples 3-7. Additional cDNA comprising an entire 
protein-product encoding region is rescued or selected according to standard molecular biology 
techniques as described elsewhere in this disclosure. 
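
The selection rule in this Example (keep cDNA whose RNA is overabundant in the cancer cells and whose gene is duplicated in a proportion of the same cells) can be written as a simple filter. The following is only a sketch under stated assumptions: the record layout, the two-fold cutoffs, the requirement that RNA be overabundant in every surveyed line, and the toy measurements are hypothetical and do not come from this disclosure.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    rna_ratio: dict  # cell line -> RNA abundance relative to control cells
    dna_ratio: dict  # cell line -> gene copy number relative to control cells


def select_candidates(candidates, rna_cutoff=2.0, dna_cutoff=2.0):
    """Keep candidates overabundant at the RNA level in all surveyed cancer lines
    and duplicated at the DNA level in at least one of them."""
    selected = []
    for cand in candidates:
        overabundant = all(r >= rna_cutoff for r in cand.rna_ratio.values())
        duplicated = any(d >= dna_cutoff for d in cand.dna_ratio.values())
        if overabundant and duplicated:
            selected.append(cand.name)
    return selected


# Invented measurements for three hypothetical clones in two hypothetical cell lines.
toy = [
    Candidate("clone_A", {"line1": 3.0, "line2": 2.5}, {"line1": 4.0, "line2": 1.0}),
    Candidate("clone_B", {"line1": 3.0, "line2": 0.9}, {"line1": 4.0, "line2": 3.0}),
    Candidate("clone_C", {"line1": 5.0, "line2": 4.0}, {"line1": 1.0, "line2": 1.1}),
]
selected = select_candidates(toy)
print(selected)
```

Here clone_A passes because it is overabundant in both lines and duplicated in one of them, which mirrors the case of RNA overabundance without gene duplication in some lines.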
SEQ. ID NO   Designation    Description                  Type         Source 

31                          0.3 kb nucleotide sequence   ssDNA        Figure 21 
32                          translation                  amino acid   Figure 21 
33                          3.5 kb nucleotide sequence   dsDNA        Figure 23 
34                          translation                  amino acid   Figure 24 
35           CH14-2a16-1    2.0 kb nucleotide sequence   dsDNA        Figure 25 
36                          translation                  amino acid   Figure 26 
37                          protein                      amino acid   Figure 26 
38 & 39      Motif          Zinc-finger binding domain   dsDNA        text 
40-43        Primers                                     ssDNA        text 
44 & up      Primers                                     ssDNA        Figures 7, 10, 15, 18 
SEQ ID NO:9: 
TTTTTTTTTT TCC 

SEQ ID NO:10: 
TTTTTTTTTT TAC 

SEQ ID NO:11: 
CMTCGCCGT 

SEQ ID NO:12: 
TCGGCGATAG 

SEQ ID NO:13: 
CAGCACCCAC 

SEQ ID NO:14: 
AGCCAGCGAA 
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Claims 

What is claimed as the invention is: 
1. An isolated polynucleotide comprising a linear sequence of at least 10 nucleotides identical to 
a linear sequence contained in a polynucleotide selected from the group consisting of 
CH8-2a13-1, CH13-2a12-1, CH14-2a16-1, and CH1-9a11-2. 

2. An isolated polynucleotide comprising a linear sequence of at least 40 consecutive 
nucleotides at least 90% identical to a linear sequence contained in a sequence selected 
from the group consisting of SEQ. ID NO:15, SEQ. ID NO:18, SEQ. ID NO:21, SEQ. ID 
NO:23, SEQ. ID NO:26, SEQ. ID NO:29, SEQ. ID NO:31, SEQ. ID NO:33, and SEQ. ID 
NO:35; but not in any of SEQ. ID NOS: 1, 3, 5, and 7. 

3. The isolated polynucleotide of claim 2, comprising a linear sequence of at least 100 
consecutive nucleotides at least 90% identical to a sequence contained in the selected 
sequence. 

4. The isolated polynucleotide of claim 2, comprising a linear sequence of at least 40 
consecutive nucleotides at least 95% identical to a sequence contained in the selected 
sequence. 

5. An isolated polynucleotide comprising a linear sequence of at least 40 consecutive 
[...] 
of SEQ. ID NO:15, SEQ. ID NO:18, SEQ. ID NO:21, SEQ. ID NO:23, SEQ. ID NO:26, SEQ. 
ID NO:29, SEQ. ID NO:31, SEQ. ID NO:33, and SEQ. ID NO:35; under conditions where it 
does not hybridize with SEQ. ID NOS: 1, 3, 5, 7, or any other DNA from a human cell. 

6. The isolated polynucleotide of claim 5, wherein the linear sequence is at least 100 
consecutive nucleotides. 

7. An isolated polynucleotide comprising a linear sequence of at least 40 [...] 
an RNA [...] SEQ. ID 
NO:15, SEQ. ID NO:18, SEQ. ID NO:21, SEQ. ID NO:23, SEQ. ID NO:26, SEQ. ID NO:29, 
SEQ. ID NO:31, SEQ. ID NO:33, and SEQ. ID NO:35; under conditions where it does not hybridize with 
SEQ. ID NOS: 1, 3, 5, 7, or any other RNA from a human cell. 

8. The isolated polynucleotide of claim 7, wherein the linear sequence is at least 100 
consecutive nucleotides. 

9. The isolated polynucleotide of any of claims 2-8, wherein said linear sequence is contained in 
a duplicated gene or overabundant RNA in cancerous cells. 

10. The isolated polynucleotide of any of claims 2-8, which is a CH13-2a12-1 polynucleotide, and 
is contained in an encoding region for a protein or RNA molecule that controls cell 
proliferation. 

11. The isolated polynucleotide of any of claims 2-8, which is a CH14-2a16-1 polynucleotide, and 
is contained in an encoding region for a protein with DNA or RNA binding activity. 

12. The isolated polynucleotide of any of claims 2-8, present in a recombinant plasmid deposited 
under ATCC Accession No. 98074. 

13. The isolated polynucleotide of any of claims 2-8, present in a recombinant phage deposited 
under ATCC Accession No. 97595. 

14. The isolated polynucleotide of any of claims 2-8, present in the XBCBT474 cDNA library 
deposited under ATCC Accession No. 97594. 

15. An isolated polynucleotide comprising a linear sequence of polynucleotides essentially 
identical to a sequence selected from the group consisting of SEQ. ID NO:15, SEQ. ID 
NO:18, SEQ. ID NO:21, SEQ. ID NO:23, SEQ. ID NO:26, SEQ. ID NO:29, SEQ. ID NO:31, SEQ. 
ID NO:33, and SEQ. ID NO:35. 

16. An isolated polypeptide comprising a linear sequence of at least 5 amino acid residues 
identical to a sequence encoded by a polynucleotide selected from the group consisting of 
CH1-9a11-2, CH8-2a13-1, CH13-2a12-1, and CH14-2a16-1. 

17. An isolated polypeptide comprising a linear sequence of at least 5 consecutive amino acids 
identical to a linear sequence contained in a sequence selected from the group consisting of 
SEQ. ID NO:17, SEQ. ID NO:20, SEQ. ID NO:22, SEQ. ID NO:24, SEQ. ID NO:28, SEQ. ID 
NO:30, SEQ. ID NO:32, SEQ. ID NO:34, and SEQ. ID NO:37; but not in any of SEQ. ID 
NOS: 2, 4, 6, and 8. 

18. The isolated polypeptide of claim 17, comprising a linear sequence of at least 15 consecutive 
amino acids at least 90% identical to a linear sequence contained in the selected sequence. 



-76- 



WO 97/38085 



PCT/US97/05930 



19. The isolated polypeptide of claim 17 or 18, wherein said linear sequence is encoded in a 
duplicated gene or overabundant RNA in cancerous cells. 

20. The isolated polypeptide of claim 17 or 18, which is overexpressed in cancerous cells. 

21. The isolated polypeptide of claim 17 or 18, wherein the polynucleotide selected from said 
group is a CH1-9a11-2 polynucleotide, and the polypeptide is a transmembrane polypeptide. 

22. An isolated polypeptide comprising a linear sequence of amino acids essentially identical to a 
sequence selected from the group consisting of SEQ. ID NO:17, SEQ. ID NO:20, SEQ. ID 
NO:22, SEQ. ID NO:24, SEQ. ID NO:28, SEQ. ID NO:30, SEQ. ID NO:32, SEQ. ID NO:34, 
and SEQ. ID NO:37; but not in any of SEQ. ID NOS: 2, 4, 6, and 8. 

23 2^JT~* an encodm9 — for the "~ - - - 



24. A monoclonal or isolated polyclonal antibody specific for the polypeptide of claim 22. 

25. A method of detecting gene duplication in cancerous cells, comprising the steps of: 

a) reacting DNA contained in a clinical sample with a reagent comprising the 
polynucleotide of claims 2-8, said clinical sample having been obtained from an 
individual suspected of having cancerous cells; and 

b) comparing the amount of any complexes formed between the reagent and the DNA in 
the clinical sample with the amount of any complexes formed between the reagent and 
DNA in a control sample. 

26. A method of detecting overabundance of RNA in cancerous cells, comprising the steps of: 

a) reacting RNA contained in a clinical sample with a reagent comprising the 
polynucleotide of claims 2-8, said clinical sample having been obtained from an individual 
suspected of having cancerous cells; and 

b) comparing the amount of any complexes formed between the reagent and the RNA in 
the clinical sample with the amount of any complexes formed between the reagent and 
RNA in a control sample. 






27. A method of determining gene duplication or overabundance of RNA in cancerous cells, 
comprising the steps of: 

a) amplifying DNA or RNA in a clinical sample with a primer comprising the polynucleotide 
of claim 1 to yield an amplified polynucleotide, said clinical sample having been 
obtained from an individual suspected of having cancerous cells; and 

b) comparing the amount of polynucleotide amplified from the DNA or RNA with the 
amount of polynucleotide amplified from DNA or RNA from a control sample. 

28. A method of screening for cancer associated with a gene duplication in an individual, 
comprising the steps of: 

a) determining gene duplication in cells from the individual according to the method of claim 
25; and 

b) correlating any gene duplication determined in step a) with an increased risk for the 
cancer. 

29. A method of screening for cancer associated with overexpression of RNA in an individual, 
comprising the steps of: 

a) determining overexpression of RNA in cells from the individual according to the method 
of claim 26; and 

b) correlating any RNA overexpression determined in step a) with an increased risk for the 
cancer. 

30. A method of screening for cancer associated with a gene duplication or overexpression of 
RNA in an individual, comprising the steps of: 

a) determining gene duplication or overexpression of RNA in cells from the individual 
according to the method of claim 27; and 

b) correlating any gene duplication or overexpression of RNA determined in step a) with an 
increased risk for the cancer. 

31. The method of any of claims 28-30, which is a screening method for breast cancer. 

32. A diagnostic kit for detecting gene duplication or RNA overabundance in cells contained in an 
individual as manifest in a clinical sample, comprising a reagent and a buffer in suitable 
packaging, wherein the reagent comprises the polynucleotide of any of claims 2-8. 

33. A method for detecting altered protein expression in cancerous cells, comprising the steps of: 

a) reacting a polypeptide contained in a clinical sample with a reagent comprising the 
antibody of claim 24, said clinical sample having been obtained from an individual 
suspected of having cancerous cells; and 

b) comparing the amount of any complexes formed between the reagent and the 
polypeptide in the clinical sample with the amount of any complexes formed between the 
reagent and a polypeptide in a control sample. 

34. A diagnostic kit for detecting a polypeptide present in a clinical sample, comprising a reagent 
and a buffer in suitable packaging, wherein the reagent comprises the antibody of claim 24. 



35. A host cell genetically altered by the polynucleotide of any of claims 2 to 8 or claim 23. 



36. A method of screening a pharmaceutical candidate, comprising the steps of: 

a) separating progeny of the cell of claim 35 into a first group and a second group; 

b) treating the first group of cells with the pharmaceutical candidate; 

c) not treating the second group of cells with the pharmaceutical candidate; and 

d) comparing the phenotype of the treated cells with that of the untreated cells. 

37. A pharmaceutical preparation for use in cancer therapy, comprising the polynucleotide of 
claim 2 to 8 or claim 23, said preparation being capable of reducing the pathology of 
cancerous cells. 

38. A method for treating an individual bearing cancerous cells, comprising administering the 
pharmaceutical preparation of claim 37. 

39. A pharmaceutical preparation for use in cancer therapy, comprising the antibody of claim 24, 
said preparation being capable of reducing the pathology of cancerous cells. 

40. A method for treating an individual bearing cancerous cells, comprising administering the 
pharmaceutical preparation of claim 39. 
41. A pharmaceutical preparation comprising the polypeptide of claim 17 or 18 in an 
immunogenic form, and a pharmaceutically compatible excipient. 

42. A method for treatment of cancer, comprising administration of the pharmaceutical 
preparation of claim 41. 

43. A method for obtaining cDNA corresponding to a gene that is duplicated or overexpressed 
in cancer, comprising the steps of: 

a) supplying an RNA preparation from control cells; 

b) supplying RNA preparations from at least two different cancer cells; 

c) displaying cDNA corresponding to the RNA preparations of step a) and step b) such that 
different cDNA corresponding to different RNA in each preparation are displayed 
separately; 

d) selecting cDNA corresponding to RNA that is present in greater abundance in the 
cancer cells of step b) relative to the control cells of step a); 

e) supplying a digested DNA preparation from control cells; 

f) supplying digested DNA preparations from at least two different cancer cells; 

g) hybridizing the cDNA of step d) with the digested DNA preparations of step e) and step 
f); and 

h) further selecting cDNA from the cDNA of step d) corresponding to a gene that is 
duplicated in the cancer cells of step f) relative to the control cells of step e). 

44. The method of claim 43, wherein the two different cancer cells used to supply RNA in step 
b) share a duplicated gene in the same region of a chromosome. 

45. The method of claim 43, wherein RNA preparations from at least three different cancer 
cells are supplied in step b). 

46. The method of claim 43, wherein the three different cancer cells used to supply RNA in 
step b) share a duplicated gene in the same region of a chromosome. 

47. The method of claim 43, wherein the control cells of step a) are uncultured. 

48. The method of claim 43, further comprising supplying a digested mitochondrial DNA 
preparation; hybridizing the cDNA of step h) with the digested mitochondrial DNA 
preparation; and further selecting cDNA from the cDNA of step h) corresponding to genes 
that do not hybridize with the digested mitochondrial DNA preparation. 
49. The method of claim 43, further comprising the steps of: 

i) supplying an RNA preparation from control cells; 

j) supplying RNA preparations from at least two different cancer cells; 

k) hybridizing the cDNA of step h) with the RNA preparations of step i) and step j); and 

l) further selecting cDNA from the cDNA of step h) corresponding to RNA that is present in 
greater abundance in the cancer cells of step j) relative to the control cells of step i). 

50. The method of claim 49, wherein the gene to which the cDNA corresponds is not 
duplicated in at least one of the cancer cells used to supply the RNA in step j) relative to 
the control cells of step e). 

51. The method of claim 43, wherein the two different cancer cells used to supply the RNA 
preparations in step b) are breast cancer cells. 

52. The method of claim 43, wherein the two different cancer cells used to supply the RNA 
preparations in step b) are from a common type of cancer, wherein the type of cancer is 
selected from the group consisting of lung cancer, glioblastoma, pancreatic cancer, colon 
cancer, prostate cancer, hepatoma, and myeloma. 

53. The method of claim 43, wherein the two different cancer cells used to supply the digested 
DNA preparations in step f) are breast cancer cells. 

54. The method of claim 43, wherein the two different cancer cells used to supply the digested DNA 
preparations in step f) are from a common type of cancer, wherein the type of cancer is 
selected from the group consisting of lung cancer, glioblastoma, pancreatic cancer, colon 
cancer, prostate cancer, hepatoma, and myeloma. 



55. A method for obtaining cDNA [illegible] expressed in 
cancer, comprising the steps of: 

a) supplying an RNA preparation from control cells; 

b) supplying RNA preparations from at least two different cancer cells that share a deleted 
gene in the same region of a chromosome; 

c) displaying cDNA corresponding to the RNA preparations of step a) and step b), such that 
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56. The method of claim 55. further comprising the steps of. 

e) supplying a digested DNA preparation from control cells; 

f) supplying digested DNA preparations from at least two different cancer cells; 

g) hybridizing the cDNA of step d) with the digested DNA preparations of step e) and step 
f); and 

h) further selecting cDNA from the cDNA of step d) corresponding to a gene that is deleted 
in the cancer cells of step f) relative to the control cells of step e). 



57. A method for characterizing a gene that is duplicated or has altered expression in cancer, 
comprising obtaining cDNA corresponding to the gene according to the method of any of 
claims 43-56, and then sequencing the cDNA. 

58. A method of screening a candidate drug for cancer treatment, comprising obtaining cDNA 
corresponding to a gene that is duplicated or has altered expression in cancer according to 
the method of any of claims 43-56, and comparing the effect of the candidate drug on a 
cell genetically altered with the cDNA with the effect on a cell not genetically altered with 
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Figure 2 
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Figure 3 
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Figure 4 
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Figure 5 
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Figure 6 
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Figure 7 
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strand (antisense) 
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17. 


CHla 




1063 


18. 


CHlb 




1079 



nee (S'-->3') 

CGG GAG GTT TCA GAT CGA C 

GCG CTG CAA GTA CAA AAT TG 

TCT AAA GTC CAA GAC CAA GG 

CAG AAA TTA TGG TTT CTA CC 

CaG GAA GAG GAG GGA TAA C 
(T) 

AAA CAT ACA CAA TAA ACA C 
TTG GCA GCG ACT GTA TTT G 
CCT GAT TTT ATA GAA GCC CC 

GGG GCT TCT ATA AAA TCA GG 
ATT CAA ATA CAG TTG CTG C 
TTA GTG TTT ATT GTG TAT G 
AGT GTT CAT TTC CAG TGA G 
CTT TGT TCT TGG ACT TTA G 
CCT TGG TCT TGG ACT TTA G 
AAT TTT GTA CTT GCA GCG C 
GTC GAT CTG AAA CCT CCC G 
GTG CCT GTA GCA ACT GGA TGG C 
GTC ATG TTG GTC AGC TGT GCC 
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Figure 8(A) 



1 GAATACATAT ATAAATGGTG TTCAGTTAGA GTTGCTCTTT ATCGGCAGTP 

51 CAGCCGAACT GCTTTGAGTA AAGGAAAAGA TTATCTTGTG TTAGCTCAAC 

101 CACCCTTACT ACTTCCTGCG GAATCAGTAG ATGTTTCAGT ATTGCAACCT 

151 CTGAGTGGAG AATTGGAAAA TACGAATATA GAAAGGGAAG CTGAAACTGT 

2° T^S^l ^AAGTA GTAGTATGCA CCAGGATGAC 

251 ACACTGTAGA TGCAGTTGAA CTTGAACCAA GCCATTCTCA AACTCTTTCT 

301 CAGTCTCTTC TTTTAGATAT TACCCCAGAA ATCAATCCCT tGcSSS 

351 AGAAGTATCT GAGTCTGTTG AATATGAGGC AGGACATATA cStcACCAG 

^1 IS^ 00 ^ AGAGAGTTCT GTTGAGATCG ATAATGAAAC AgSSg 

451 TCTGAGAGCT TTAGTTCTAT AGAGAAACCA TCTATTACCT ATGAAACAAA 

501 TAAAGTTAAT GAGTTAATCG ATAATATTAT AAAAGAAGAT ATGAACTCCA 

551 TGCAAATTTT CACAAAGCTG TCTGAAACAA TAGTGCCACC AATaSS 

601 GCCACTGTAC CCGACAATGA AGATGGGGAA GCCAAAATGA AtIJSSS 

651 CACAGCAAAG CAAACTTTGA TTTCTGTTGT GGATTCTTCT tStTAcS 

7oi aagtaaaaga agaagaacag tctccagaag atccccS? SS 
751 cagaggacag ctacagattt ttatgctgaa ttgcaaaatt ctSSSSS 
801 aggatatgct aatggaaatc ttgtacatgg atcaaaccaa aaSSS? 

851 TATTTATGAG ACTTAATAAT CGTATTAAAG CCTTAGAAGT SS 

901 CTCAGTGGTC GCTATCTGGA GGAGCTTAGC CAAAGGTACC gWcSS 

1 nm ^^ TC CAAAAGGCrr -TCAACAAAAC AATCGTGAAA 

1001 CTTCAAGAAT AGCAGAGGAG CAGGATCAGC GGCAAACTGA AGCCATCcIr 

10S1 TTGCTACAGG CACAGCTGAC CAACATGACA CAGCTTGTTT 

1101 AGCAACAGTA GCAGAATTGA AACGGGAGGT TTCAGATCGA CAAAgSaTC 

1151 TTGTCATATC TTTGGTTCTT TGTGTTGTCT TGGGACTGAT SSS 

1£\ S^™"* GAAATACTTC TCAATTTGAT GGAGATTATA tSSS5c5 

1251 TCCTAAAAGT AATCAGTATC CAAGCCCTAA AAGGTGTTTC •TCTTCCtJtc 

1301 ATGATATGAA TTTGAAAAGA AGAACTTCAT TCCCACTCAT ScSS 

1351 TCTCTACAGT TAACTGGCAA AGAAGTAGAC CCAAATGATT TCTaSSg^ 

1401 AGAACCCCTC AAGTTTTCTC cagaaaagaa gaagaagcgc S^SS 

1451 aaattgaaaa aattgagacc ataaagcctc aagaaccaS £S£2SS 
gccaatcgcg acataaaagg aagaaagccc tttacgaacc a^gaSSS 
ttctaatatg ggagaagttt atcacicttc ttataaaggt cctccatSS 

AAGGAAGCTC AGAAACTTCA TCACAGTCAG AAGAGTCCTA TTTTOlS? 
ATTTCAGCTT GCACAAGTCT GTGCAATGGA CAGTCTCAAA AcSSS 
TGAGAAGAGG GCTTTAAAAC GAAGACGATC TAAAGTCCAA GAcSS 
1751 AATTGATAAA AACTCTAATA CAGACTAAGT CGGGATCATT GCCGAGCCTT 

18?} ^™ TAA TCAAAGGAAA CAAAGAGA1C ACCGTCGGaI 

o SSSS SSS JS^™ ATO ^crrr SS5 

C^CTTTTTTG TTCnTCTTCT TTGAAGAACA GTCTGTAGTA TITGAAnr^r 
TTGGGGGAGG GAGAAAATAT TAATGGGAAA GGCATiSa aIJSJSS 
TCTACCTTIT TAAAAAGTAG ATGGGATTCT GCTX^TCTT 
CTACAGTTTT ACAAAGCTCA TCACTTCCTA TAAGGaSS 
TTATAAAGAT GTTTTTTCAC AAGATTAATT ACTGGGACAA AAGtS™ 
GAAGCCCAGT TCCTTAGGTG GGATAGGAAT GAAAGCCTAA 
TTAGCTTTCT TCCTATTTCT TGCACCTTCC CATATTTATG tSSS 
CTArrrATAA 'K^CTCGA AGAGGAGGGA TAACTTTTTC SSSI 
TITCTITrAT AACTTTGTTA GGTTTTTGAA GCTCCAAaS SISSS 
TTCAGGGGGT CTCTCCCTGA AGCTCAGGAG TCTCGATCAG SgtcSS 
2401 GATCCTAAAA ACTTGCCAAC TGGATCTTCG TTTAGCAAAC 

«S ^^T rA ATCGAATm TAAGTCTGTT CTGTTAGGTA gSgSS^ 

2501 CTCTTGTTAT TTTCACTTAT TCAGGCTGGA TTACTTCTTA SSSS 

2551 AACTCAATGA GGAAAAAATC CCTACAGGAT CITITmS SIS^ 



1501 
1551 
1601 
1651 
1701 



1951 
2001 
2051 
2101 
2151 
2201 
2251 
2301 
2351 
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2601 


TATATGCAGA 


2651 


CGATTTCTGA 


2701 


ACTAATCCTC 


2751 


AAGGTTTGGC 


2801 


CATATGCACA 


2851 


AAGTCTAAAG 


2901 


ATGTGCAGAG 


2951 


GTCTTTAAAA 


3001 


AGTATGATTC 


3051 


GGGAGAGAAT 


3101 


AGCACTTTTA 


3151 


GGGTATTGTT 


3201 


TTGCTGCTTT 


3251 


GTCTGCACAA 


3301 


CCTTGAAAAG 


3351 


ATCTGTGTGT 


3401 


GTGCCATTAG 


3451 


TT 



CA AATl-r riG ACAAATTCAC 
AGGT TTTCT T TAGCTTACAT 
CAAACTTTCA CTCTTTTTAT 
CAATTAGTAC AAGTCTCATG 
GATCCAGTTA GTGAGTTTGT 
AGATTATTAT TCCTTGATGT 
GTAATACATA TGTGATCTCG 
AATAATTCGC AGCAACTGTA 
TACAGTAATG AATGAAAGTG 
TGACCATTTA TTGTTGTGAT 
GTAGTGATAA CTGTTTTTAA 
TCTAATGTCA CTTATTTAAC 
AGGTTAACAG CGTGTTTTAG 
TTAGCTATTC AGAGCAAGAG 
AGGTCCAGAT GAGAGCAGAG 
TGTGGGAAGA GAATTTTCAA 
AAACTGTGAA TTTCCAAATA 



CTTTTAAACA CGACGTTAAC 
TTTAAACATA CACAATAAAC 
TAGTATGAAT ATAAAATTTG 
ATATAATCAC AGCCTGCATA 
CAA GCTT AAT CTAATTGGTT 
TTGCTTTGTA TTGGCTACAA 
ATGTCTCTGT CTITTTTTTT 
TTTGAATAAA ATGATTTCTT 
GAACATGTTT CTTTTTGAAA 
GTTTAAGTTA TAACTTATTG 
ACTTGCCTAA TACCTITCTT 
GCCTTCTTTG TTTCTTTAAG 
AAGATTTAAA TTTCTTTCCT 
GGCCTGATTT TATAGAAGCC 
ATACAGTGAG AAATTATGTG 
TATGTAACTA CGGAGCTCTA 
AATCTGAACA CTTGTCTTTA 
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51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 



i SSSS SSSS asSSf** ESVDVSV ^ 

1 QSLLLDITPE £^3g JSKEK! ^WHSqEs 

SESFSSIEKP SITYETNKVN EiWikS VEIDNETEQK 
ATVPDNEDGE AKMNIADTAK SilswSS ^^3^ SETI VPPINr 
QRTATDFYAE LQNSTDLGYA NGNLW^S ^^F* 0 Sp EDALLRGL 
LSGRYLEELS QRYRKQMEEM I^S^ RIKA LEVNMS 

LLQAQLTNMT QLVStSn/ ^f^^ Q^PTCAIQ 

QRCRNTSQFD GDYISKLPKS i^Jf^ 
SLQLTGKEVD PNDLYIVEPL KFcprSE SSYDDMNLKR RTSFPLMRSK 
ANGDIKGRKP FtoJS GeS IKPEEPLHPI 
ISACTSLCNG QSQ^TKTEKR aSSh £!!f? SSCTS SQSEESYFCG 
HDIIKGNKEI TVGTFGvSv JgS^S ^? LrKTLI QTKSGSLPSL 
LGEGENINGK GIQKiSSf ^Sg^ gT^? 5 ^L-YLKG 
L.RCFFTRLI TGTKVIWKPS SLGGTraW' S^T^ 3 * SLPIRTMVDI 
LFIMPLEEBG .LFLLFDFFY NfIS^ ^fE!!^ 13 CTFPYLCAFC 
DPKNLPTGSL FSKLTGNEHL ESSS^ S SGVWIRQSK 

NSMRKKSLQD LFLQTTDICR otpt^H ^ DGDALVIFTY SGWITSYLVT 
TNPPNFHCFY tSSSSg 255^2 ^f^ 0 ^ • LTF ♦ TYTIN 
KSKEIUP.c LLC1GYKCAE VlX^ SLHTYAQIQL VSLSSLLLV 
SMIVQ* -MKV EHVSfTkg^ ^PF^SP T^^ 10 SNCLIK-FL 
GYCL.CDLFN AFFVCLSCCF RLTaSS ctHJT'**" ^^^TFL 
P-KEVQMRAE IQ-EIM-SVC Co£S5?¥ ^ CLHN * LF RARGPDFIEA 

c cgkrifnm-l rscsairnce fpnksehlsl 



51 
101 
151 



EYIYKWCSVR VALYRQRSRT ALSKrmviu 

LSGELENTNI EREAETWLG DLsSSn mS^^ ESVD VSVLQP 
QSLLLDITPE INPLPKIEVS EsSSSS? £™nVDAVE LEPSHSQTLS 

201 f^ SslEKP s ™etS SESS 5 veidneteqk . 

201 ATVPDNEDGE AKMNIADTAK QTLISWdS SETrV PPINT 

251 QRTATDFYAE LQNSTDLGYA SnLVHtS ^^VKEEEQ SPEDALLRGL 

301 LSGRYLEELS QRYPJCQMEEM SSS ^SSf^ RIKA ™1S 

351 LLQAQLTNMT QLVSNLSATV AElSS %E? RUEE QDQRQTCUQ 

401 QRCRNTSQFD GDYISKLPKS nSSct ^^^^CWLGL^ 

«1 SLQLTCKEVD PNDLYIVEPL SspSS ^^^^sFpiMsK 

501 ANGDIKGRKP FTNQRDFSNM GEVySSS SSS?^ IKPE EPLHPI 

551 ISACTSLCNG QSOKTKTFVn »r«r™f PPSEGSSETS SQSEESYFCG 

60i hdiikgnS SS Sf^ 1070 DQGKLIK ^ r otkSSS 
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i* ( /°" so) sequence (S'-->3') 

1. Pch8-sp6-lf 369 

2 • pch8-sp 6 -2f 677 tCT GAT CTT CTG ^ ^ q 



3. pch8-lfa 123 8 



(CTC) 

TCT GAA CTG CCT GAG AGA C 



4. pch8-2f T/ic-y 

5. pch8-3f ^745 JCJ ~T GGG AGC ATT ACA AG 

1/45 TCA TCA AAT GAT CAG AAC C 

6. pch8-4f iqqcr „ mm 

7- pch8-5f 2277 A JT CTG GAG ACT TGG TAT CC 

2277 GGA ATA AGG AAA GAG CTT G 
8. pch8-6f icr q 

9- P ch8-5rb 2A4Q J£ C ACT CAT ATT CCA ATA CC 

2849 CCT GAG AGA CAG AAC TGT TC 

10.pch8-4rb -mqn 

H.pch8-3rb 3?7q GGJ CCC TTC ACT TCC TTA C 

3 370 GGC CAC CAC TTG TCC TGG G 

12. pch8-2rb 3517 fcm „ 

13. pch8-lrb 3970 S£ AGT GCT CTA ACT G 

jy?0 GTA CTG CCT CTC TTA AAT G 

14 8 p^r2r (antiSenSe, 3 617 Se ^- (5, " >3 '> 

J 6 17 CAG TTA CAG CAC TGT TCT G 

3 360 CCC AGG AC A AGT GGT GGC C 

38^9 ?T A AGG ^ TGA AGG GTC C 

3849 GAA CAG TTC TGT CTC TCA GG 



14 . pch8 

15. pch8 


-2r 
-3r 


16 . pch8 

17 . pch8 


-4r 
-5r 


18. pch8- 

19. pch8- 


-6r 
-5fb 


20. pch8- 

21. pch8- 


■4fb 
•3fb 


22 . pch8- 

23. pch8- 


2fb 
lfb 


24 . pch8- 

25. pch8- 


fb-lf 
fb-2f 


26. CH8-3670 

27. CH8a 


28.CH8b 





3563 CTT GGG TAT TGG AAT ATG AG 

2277 CAA GCT CTT TCC TTA TTC C 

1999 ATA GGA TAC CAA CTC TCC AG 

1746 TGG TTC TGA TCA TTT GAT G 

Wtl 22 GTA ATG CTC CCA TTT GG 

1238 GTC TCT CAG GCA GTT CAG A 

941 GTA GAG AAT CAC GTA CAG C 

6 12 CAA TGA CCA GTA GCA TAA C 

387 1 CAT TTA AGA GAG GCA G 

387 CCT GTA GCT CTG GCT TaS CAT CC 

510 CCC CTT CAT TGA GAT CAT CTA G 
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SS SSSf T ^AATCG AGGAGCCAAT 

101 GCCAGTTAAT CA^SS SSJSSS iSS^ 

151 CAATGTTCGA CTTTCTAGCC rlrT^^H AGAG TAGTCG GTCOGCCTCA 

201 AGGATTGTTT SSSS £££££ A «*£S 

251 TGAGTTTATT CCTGCTGTGT TCAGGtSU ^T^ 01 ^ TGAGACTCTC 

301 AATATGGAGA TATCATATTT gSSSSS ^■ < ^ GAGCT ^TCAACAGA 

35i TCGGAAAGCA AACTCrft^ ^HT^ 001 ^ ATTTTAAGGG TCCAGAAVta 

joi ArmxnSS mcaaSS ZSSS? SS5* tc 

451 AAAGTGTACA TAAATATATT PTA^^^ CAGATTTTAT TTAGCATITC 

501 AATCAAGGGG TTTATATtcI S2£^* ACAGA ™TCT AGATCATCTC 

551 AGATGGAAAA CAACTTTT^ ^ AAACCTTA GAAACTCTCC TTCTCAaSa 

|0i tactSS SJSSE SS ^cttatat SS 

651 GTITCTTACT ACCGATACAr S^S* 0 ^TCAGAGA GAGGATCOTO 

7? GGACGATATT SS SSSSS aSS^S A ™S 

751 GTCCCAAAAG ACCATCCAAC TM»oS£ AGGTTArrCT AGCCAACCAG 

801 ATCAACGAAT CCTTCATCAG JSSSSS <»GAGTCCCT 

851 TATTTACAAC CAGGTCTrar- G^GACTGA GATCTGATTA 

901 CCCTGGCAAA SSSj £SS5£ T ^^SS 

951 TCCATCCTTC ACACcrhr^ a^T^ G ^TCTCTA CTTTGAGTrr 

os 0 ! 1 ss*** 1, ss ss sssss 

TAGTAGATGC TTGGGAACCT TACAAAGCT^ ^1^5° ACAG TTAATC 

HOI ACCCTGGACC TTTCAAATX3T CAGAct^ CAAAAACTGC TTTAAATAAT 

1151 CAGTGAAAGA GTGCATGCTC ?2^S AGAT ATCCTACTCT 

1201 TAAGGGAGGA GATGGTTCTG GAAGGTTATT 

1251 AGAGACTCCA ATGTTGCCAT CCGATO^ ^ AAA ^ TCT GAACTCCCTC 

13 01 AGCCTGTGAC CCAAACAACA SSSSS ^GCAGACTC 

1351 TAACAGACTC TCGGTACAAT CCCAcS^r JS^** 0 GMXAGATTC 

1401 ACTGCACAAT TTCAGTTTA? AcSSS GCrG ^^T 

Jsof A 5 MAAGCAA accaaatcgg SS SI^ 00 ^TCcrrrc 

Jssi I^ctgagct tgctgatgtc tSSSJ "tccgagcgga 

llof ^S AAAAATC aaaaccttca agcSgS? a^^ 0001 * ^ccagagtc 

1601 attgtcttta aattatcatc atSSJS ^^agagatct caaaacaaat 

1651 AACTCATACA AGCTTTGGAA 5SS 1^°°^ ^CTCTAC 

1701 AATCTGCAAG TATGTCAGTT TcSS^T AA ^ CACCA GTTCGAATCC 

1751 AATGATCAGA ACCATTAACA SaSSI T^TCATCA 

"01 TCGTTGGGGA CCTTTCTTTC GctJSS^ GGTTCTGATC ACAATCCAGA 

"51 ATCATCCAAG AAAGCATa£ ^TIGACAG TITCACATrr 

[HI AGCTACCTTC C^SS£ cS££cS ^CATCGTTA SS^c 

"SI TTAATCAGGC AAATCGCCCC rT^S^ ^TCTCCCC CTTCTTCGTA 

2001 GGAGAGTTGG TAToSS S^*^ ^^^ACA GTACTATTCT 

2ioi 2^™-^ SS SS&KI S^ 0 *** tcSSSS 

2101 TTGAAGTCCC TACCCGCCTG PArT^^ GCITCAGAC C CACGACATTA 

215 CTAGGCCCAC GaScSS £££££ S^™*** ^TATt^E 

2201 TACTGAAGGC ATCTTAATCA SjcSJ SS^™ ^^Tr 

|251 TGGATCCAAA GCAGTTGCTG GAAGaSSa IFF 00 ^^ 000 ATCATCAAGG 

2301 CGOGTTGCCT TTCCCCTCCA SgStS^ TAAGGAAAGA GCTTCTCAAG 

2351 GCCAAGTGAA TTGATCCCCA aSS^ ATATTCAACC CTCGAGCCAA 

2401 GATTCCATCG TTCTTTTGAA JSSctS aSS^ 

s= sssgg s~ L~ ss 

2551 ^cc^ ^ g^gM JgJgJCg J^AGCJ 
2601
2651 
2701 
2751 
2801 
2851 
2901 
2951 



SS SS £S£3 

s= ss jsS iH esse 

GATTGTAAAA GAGTTACAGA ATtcScAg' TCT GCrrTAT 
TGAGAGACAG AACIXmSS GACaSSJ Ia a^?" 3 ^^TCC 
AGTCCCCTAA AAAGTATTCT OgSSSS ^^ CTCAT GAATGCTCTC 
CATTGCCAAA ACACAGAAGA T^SS £1^™ A ™^CGC 
3001 AGGTTGGGCA GATGCAGATT CTGAoSaS Sa^?^ 

3051 TATTCTTCTC GGTTTCATTT Taaa^^ AGATTGCCAA TGAATTAAAT 

3101 CAATAAGGCT CTcSS TCGAG ^ 

3151 TTCCTTACCC CAAAGAAGAT MCAcSS SS^S GACCCTTCAC 

3201 CTGGAGGCAG CTGGCATTCA CAACCCACTr 1*™?^^ CACAGCCTAT 

3251 AAAGCGCTTA CCCTArTITC CAA^SS JSS^ ACATAACAAC 

3301 AGTTCCCAAA ACTTCAATAC AACAAaW SS^T^ ^'TCGCrc 

3351 CCGACCGACC CGGTTGATTC nrrAr^S Z^ GAATCGT CTCCCGAAAA 

3401 GCTCAAGCAG TTCCATTCCC Str^T GTCCTG GGAC TGCTCACTCT 

3451 GCCAGTTTAT CTgSSS ££AGCTCCTG GCGCTCATTC 

3501 GAAATTCCTG CAGATGTTG? SS^T 2^°°^ GAAGATACCT 

3551 TCGGTACACA AAGCTACOCA GgSS^ ^ TTCCTCG AGGATTATCT 

3601 TCATTTTTGA 1GAGTTCAGA TGAAGCACAT GTGCCTAATT 

3651 CAATGGAAGG ATTGTCCTTA cS^^ T AACTG ' rr T^ CCTACTTCTT 

3701 GATGAAAAGA £££££ ScSSS ™™AA 

3751 TGGGAAACAT CAGACGTTAT GACTaa™^ Z^™^ CTGTCTATTA 

3801 TATAACTCAT ATTOtSaaa TATCTCATGG CATTAGTTAA 

3851 AAAAGCAGAA CACArSJ? ACATGCAATT TATATCAGAT 

3901 GTTATCTATA StcSSS JSS^ otaaatoct GAATGTAACT 

3951 CTCCAGATTT TcSJSS g AAAGMCT ATTTGTGCAA 
Figure 12(A)



APWRGPADRF FNGGANLSAH LVSSNNIQTP ALRPVNHPQC PGTE'SVRLT 
m^FLAENML CGQAILRIVS CGNAIIAELL RLSEFIPAVF RLKDRADQQK 
YGDIIFDFSY FKGPELWESK LDAKPELQDL DEEFRENNIE IVTRFySfo 
SVHKYIVDLN RYLDDUJEGV YIQQTLETVL LNEDGKQLLC EaISS 
LV1DQKIEGE VRERMLVSYY RYSAARSSAD SNMDDICKLL RSTOYsSS 
AKRPSNYPES YFQRVPINES FISMVTGRLR SDDIYNQVSA YPLpSrIS 
LAN^YV ILYFEPSILJi THQAKMREIV DKYFpS^ 
VDAWEPYXAA KTALNJJTLDL SNVREQASRY ATVSERVHAQ VODFLKEGyT 
REEMVLDNIP KLUKXRDCN VAIRWUfliTr ADSACDpS SEmE 
TDSRYNPRIL FQLLLDTAQF EFILKEMFKQ MLSEXffmS SySeSeS 
TELADVFSGV KPLTRVEKNE NLQAWFREIS KQILsSraD 
LIQALEEVOE FHQLESNLQV CQFLAOTRKF LHQKIRTINI k2£S? 
VGDLSFAW3L IDSFTSIMQE SIRVNPSM^ KLRATFLKLA sSdSSSt 
NQAjmPDLLS VSQYYSGELV SWRKVLQII PESMFTsS 
EVPTRLDKDK LRDYAQLGPR YEVAKLTHA1 SIFTEGILMM KtSSttkv 
DPKQLLEDGI RKELVKRVAF ALHRGLIFNP RAKpSS £££££ 

QEEVSRIINY NVEQECKNFL 
YQSTHIPIPK FTPVDESVTF IGROCREILR ITDPKMTCHI DDLNTWVTTMTf 
THQEVTSSRL FSEIQTTLCT PGLWGLDRLL CBhSn fSSSS? 
RDRTVQDTLK TLMNAVSPLK SIVANSNKIY FSa£S£ SSZ 
VOQMQILRQQ lANEI^SCR FDSKHLAAAL EnSS S^pS 
f™™^ YEITAYLEAA GIHNPLKKIY ITTKRLPYFP IvSSo 
LPKLQYNKNL GMVCRKPTDP VEWPPLVLGL LTLLKQFHSR YTCOLLaS? 
SSSES* ^ IPEIPA DYVRy££S SSaS 

IFDEFRTVL. LFFLLLQWKD CP.IFPPSQM NLKMKRNSVA HTTAFFLSTM 

W SS£« SHGIS « W ' y Cl^HGircNL YQIKAEHIFV SJ^S 
YV.IHLVLCS KELFVQLQIF SKIVLL 



Figure 1?(B)



sssss sssss SK2E 5SSSS ESS** 

EES = sS EF 2=2 

AKRPSNYPES YFQKVPINES nSS? SNMDDICKLL RSTGYSSQPG 
LANQAAMLYV IlJfEPS^S SSS St^* 
VDAWEPYKAA KTALM/TLDL SNVRkSry aSS^T SmGn ™< 
REEMVLENIP KLLNCLRDCN SuS^S ^f^ 0 V «>FLKEGYL 
TDSRYNPRIL FQLU^QF iff^™** 
TEIADVFSGV KPM^S S^Ss S^SS* "^SERM 
LIQALEEVQE FHQLESNLQV SS STAAGRKTVQ 
VGDLSFAW3L IDSFTSIMQE sSvSSSJ SS^^ 1 KEEVLI ™QI 
NQANRPDLLS VSQYYSGELV SYVRja^T S^E"^ ^^^Rl 
EV^LDKBK LrSv AQLG ^r s ^f^ ™*»>» 

DPKQLLEDGI RKELVKRVAF ALHRtTt^d f^^P^ KITLVG HKV 
FHRSFEYIQD YVNIYGLKIW SfSSp™ RAKPSELMPK LKELGATODG 
YQSTOIPIPK F^S^ SS m » 

RDRTVQDTLK TtMWvSS S^SS^ CFMIVKEL ON FLSMFQKIIL 

vgqmqJlrqq fSSL7 fsaiakt^ki KTAYLEMKK 

PYPKEDOTLL YeSS SSS^t f UiKALLAD IEAHYQDPSL 

OFICSTVEQC TSQKIPEIPA DV^M^ J^OFHSR YTEQLLAXIG 
IFDEFRTVL DWGALLFLE DYVRYTKLPR RVAEAHVPNF 
E £ 2 £ T S S S S S S s IS 000 * CT - •»

CCG CGC GCG ACC TCC GGC CCT GCG SI ^ TGG GTG C CG GCT 

gac agg rrc rrr aat gca SI ccc" IS tct S S CCG GCT 

AAT ATA CAG ACA CCA GCT CTG IS Irl S ° CTG GTT TCA 
TGT CCA GGC ACA GAG TAG TCG GTC S 21 21 ^ CC ° 

GCC GAG AAC AAC CTC TGT GGC CAA Gel 2S SI G " G GA ° TTT "A 
GGT AAT GCC ATC ATT GCT GAA CTt" TTG III ^ *" G " TCC T ^ 

GCT GTG TTC AGG TTA AAA GAC 21 US ^ A GAG TTT ATT CCT 

ATC ATA TTT GAT TTC Igc SS £ He ^ SI S! *** W G 
AAA CTG GAT GCT AAG CCA GAG CTA Ca"g St 2£ ^ " A TGG GAA AGC 
GAA AAC AAC ATA GAA ATT GTG Zc HI SI A ^ GAA GAA TTT CGT 
GTA CAT AAA TAT ATT GTA GAC J£ He 21 III "* ** AG * 
GAA GGG GTT TAT ATT CAG CAA Ice TtI «I IS A ^ GAT CTC ^ 
GAT GGA AAA CAA CTT CTA TGT GAA III ^ GTG ^ CTC AAT GAA 

CTA CTG GTC ATT GAC SJ £ ™ CAA IT ™ G " ™ 

CTG GTT TCT TAC TAG CGA TAC ACT GCT £2 ^ T ° AGA GAG AGG ATG 
AAT ATG GAC GAT ATT TGT AAG CtI rZ ^ TCT TCT GCT GAT TCA 

CAA CCA GGT GCC AAA AGA CCA TCC III SI J" ^ ™ T TGT AGC 

AGA GTG CCT ATC AAC GAA S TTC IS IS S£ 2*° AGG TAT ™ CAG 
AGA TCT GAT GAT ATT TAC AAC £g £c S S ™ *" GG ^" GGA CTG 
CAT CGC AGC ACA GCC CTG GCA AAC SI IS o ™ T CCT ^ CC C GAG 
CTC TAC TTT GAG CCT TCC ATC Sc 21 ?* A ™ ° TG TAC GTG ATT 

GAG ATA GTG GAT AAA TAC TTT CCA St S ^ ° CA *** AT ° AGA 

ATG GGG ATC ACA GTT AAT SI gS nS ^" GG GT A ATT AGT ATT TAC 

GCA AAA ACT GCT TTA AAT Hi ™ J™ GAA G CT TAC AAA GCT 

CAG GCA AGC AGA TAT GCT ACT GTC Ict 21 I" ^ GTC AGA GAA 

CAG CAA TTT CTA AAA GAA GGT TAT J£ 2£ i2 J™ GA ^" GCT CAA GTG 
AAT ATC CCA AAG CTT CTG AAC TGC CTG IS nS ^ GTT CTG GAC 

CGA TGG CTG ATG CTT CAT ACA S ZZr SI ^ TG ° AAT GTT °CC ATC 
AAA CGC CTT CGT CAA ATC IS aZ £ IS ^ TGT ^ CCA ^ AAC 
AAT CCC AGG ATC CTC TTC CAG CTG CTC TTA «I ^ ^ TCT GGG TAC 
TTT ATA CTC AAA GAG ATG TTC Sg S IZr S ACT GCA CAA TTT GAG 
AAA TGG GAG CAT TAC AAG AAA GAG GGT Trr T ^ AAG CAA ACC 

GCT GAT GTC TTT TCA GGA GTG AAA CCC CTA Irr ?* ^ ACT GAG CTT 
GAA AAC CTT CAA GCT TGG TTC IgI Sg £Z ^ GTG ° AG AAA AAT 

TTA AAT TAT GAT GAT TCT ACT J£ GCG air lr> *** ^ ATA TCT 
ATA CAA GCT TTG GAA GAG GTT SI G^ S ^ ^ GTA GAA CTG 

CTG CAA GTA TGT CAG TTT CTT GCC GAT IS Si ^ ^ ^ TCC ^ 
ATG ATC AGA ACC ATT AAC ATT AAA GAG S£ TTT CTT ^ CAA 

ATC GTT GGG GAC CTT TCT TTC GCT TGG CAP tty^ ^ ^ AGA ATG CAG 
JC'C ATC ATG CAA GAA AGC ATA AGG GTA AAT CCA SI ^ ^ TO AGA 
CTC AGA GCT ACC TTC CTA AAG CTT Zc tS ~ ^ ^ ACT AAA 

CTT CGT ATT AAT CAG GCA AAT cS CCC I7r ^ ^ GAT GTG C CC CTT 
TAC TAT TCT GGA GAG TTG JJI TCC S£ GTG S ^ AG< ~ GTG TCA CAG 
ATC CCA GAA AGC ATG TTT ACA TCT CTT CTA lap ^ ^ CAG ATC 

ACC CAC GAC ATT ATT GAA GTG CCT 21 2^ AT ° ATA ^ CTT CAG 

AGG GAC TAT GCT CAG CTA cSc cS S Sc rlr ^ GAG AAG CTG 

- GCT ATT TCC ATT TTT ACT GA^^^^ 
£ £ S 2S S 2£ 22 S SS iS ™ 2? CTG GAA GAT GG *

CTG ATA TTC AAC CCT CGA GCC AAC ^ n GCC CTG CAT AGG GGA 

™ CAC TTG GGA Ocl ACC aS J£ £ £ SS St TCT TTT T ^ 
ATA CAG GAC TAT GTC AAC ATT TAT GGT CTC 2£ 21 ^ GAA TAC 

GTA TCT CGT ATC ATA AAT TAC AAC rSJ ^ TCG CAG Gft A GAA 

CTA AGA ACG AAG a£ W T^ Sa Ac"c ^ ^ ^ ^ ^ m 

ATT CCA ATA CCC AAG TTT ACC CCT £S i£ i™ ™ ^ TCC ACT ^ 
GGT CGA CTC TGC AGA GAA A~TC CTG S 22 ^ ** ATT 

TGT CAC ATA GAC CAG c£ ^ S ^ S oTr ATG S A^ J™ 
GAA GTG ACC AGC AGC CGC CTC TTC TCA CAA aJJ i *** ACT C AG 
ACC TTT GGT CTA AAT GGC TTA 1^ ^ ^ A ° C ACC TTG GG A 

AAA GAG TTA CAG Sc Sc AC~t a^ CTG TGC ATG ATT GTA 

SAC AGA ACT GTT Sg gIc S i£ S IS ^ CTG AGA 

CCC CTA AAA ACT ATT GTC Hi ° ATG *»* GCT GTC AGT 

ATT CCC na* *™ TCA AAT AAA ATT TAT TTT TCC rrr 

*Z gS C^ S £g "g S ™ ^ CTC GAG G ^ ATA ATG 

AAT TAT tS ^ cIg TT? gIT S aS C^ ^ *" «* ** T GAA "A 
AAT CTC AAT AAG GCT CTC CTA GCA GAC AT-r 2° ^ ^ GCT CTG G AG 
CCT TCA CTT CCT TAC CCC £2 G^ ™ a- ^ ^ TAT CAG GAG 

ACA GCC TAT CTG Sc" ^ ^ ™ TAT GAA ATC 

TAC ATA ACA ACA AAG CGC tS J£ £5 J£ ATT ^ ^ ^ A *^ A 

TTT TTG ATC GCT CAG TTG CCA aaa CTA AAC TTT CTA 

ATG GTC TGC CGA AAA CCQ ACC GAT rrr ™ ^ *** MT CTG GG A 

CTG GGA CTG CTC iJJ Sg CTC aa^ ^ 1X30 CCA CCA CTT GTG 

CAG CTC CTG GCG CTG ATT £c etc If TCC GGG TAC ACC GAG 

TGT ACA AGC CAG AAG ATA CCT GAA ATT CCT GCA GAT ^ ^ ^ ^ 
CTT CTG TTC CTG GAG GAT TAT GTT cS S a^ a^ CTG GGT GCC 

GTT GCT GAA GCA CAT GTG CCT aaZ ^ ! ACA ^ CCC AGG AGG 

GTG CTG TAA CTG TTT TTC CTA CTT CTT ca^ < ^ > ^' ^ GAG " C AGA ACA 
ATC TTC CCA CCA TCA CAA ATG AAT ttt a^ ^ ^ ° AT TCT CCT TAG 
GCT CAT ACA ACT GCA TTT TTT CTG 7cT ATT ^ ^ ^ ^ TCA GTT 
TAT GAG TAA GAT ATA TCT CAT GGC att ^ ^ AAC ATC AGA CCT 

TTA AAT CAT GGT ATT IS TGC AAT tt! Jf ^ TAT ^ TCA TAT TGT 
ATT TTT GTA CTG CCT CTC TTA ^ Si ^ ^ ATA ^ ^ GAA CAC 
ATC CAT TTA GTT TTA TGT TCT AAA caI ^^A TGT AAC TGT TAT GTA TAA 
TTC AGT AAA ATA J£ JS SI CTA TTT GTG CAA CTC CAG ATT 
Figure 14(A)



Arg Gly Gly Ser Arg Gly Leu Thr Arg Ser Arg Ser Gly Thr Ala Asp
Arg Arg Gly Pro Pro Pro *   Gly Arg Gly Gly Asn Trp Val Pro Ala
Pro Arg Ala Thr Ser Gly Pro Ala Arg Ala Pro Trp Arg Gly Pro Ala
Asp Arg Phe Phe Asn Gly Gly Ala Asn Leu Ser Ala His Leu Val Ser
Ser Asn Asn Ile Gln Thr Pro Ala Leu Arg Pro Val Asn His Pro Gln
Cys Pro Gly Thr Glu *   Ser Val Arg Leu Thr Met Leu Asp Phe Leu
Ala Glu Asn Asn Leu Cys Gly Gln Ala Ile Leu Arg Ile Val Ser Cys
Gly Asn Ala Ile Ile Ala Glu Leu Leu Arg Leu Ser Glu Phe Ile Pro
Ala Val Phe Arg Leu Lys Asp Arg Ala Asp Gln Gln Lys Tyr Gly Asp
Ile Ile Phe Asp Phe Ser Tyr Phe Lys Gly Pro Glu Leu Trp Glu Ser
Lys Leu Asp Ala Lys Pro Glu Leu Gln Asp Leu Asp Glu Glu Phe Arg
Glu Asn Asn Ile Glu Ile Val Thr Arg Phe Tyr Leu Ala Phe Gln Ser
Val His Lys Tyr Ile Val Asp Leu Asn Arg Tyr Leu Asp Asp Leu Asn
Glu Gly Val Tyr Ile Gln Gln Thr Leu Glu Thr Val Leu Leu Asn Glu
Asp Gly Lys Gln Leu Leu Cys Glu Ala Leu Tyr Leu Tyr Gly Val Met
Leu Leu Val Ile Asp Gln Lys Ile Glu Gly Glu Val Arg Glu Arg Met
Leu Val Ser Tyr Tyr Arg Tyr Ser Ala Ala Arg Ser Ser Ala Asp Ser
Asn Met Asp Asp Ile Cys Lys Leu Leu Arg Ser Thr Gly Tyr Ser Ser
Gln Pro Gly Ala Lys Arg Pro Ser Asn Tyr Pro Glu Ser Tyr Phe Gln
Arg Val Pro Ile Asn Glu Ser Phe Ile Ser Met Val Ile Gly Arg Leu
Arg Ser Asp Asp Ile Tyr Asn Gln Val Ser Ala Tyr Pro Leu Pro Glu
His Arg Ser Thr Ala Leu Ala Asn Gln Ala Ala Met Leu Tyr Val Ile
Leu Tyr Phe Glu Pro Ser Ile Leu His Thr His Gln Ala Lys Met Arg
Glu Ile Val Asp Lys Tyr Phe Pro Asp Asn Trp Val Ile Ser Ile Tyr
Met Gly Ile Thr Val Asn Leu Val Asp Ala Trp Glu Pro Tyr Lys Ala
Ala Lys Thr Ala Leu Asn Asn Thr Leu Asp Leu Ser Asn Val Arg Glu
Gln Ala Ser Arg Tyr Ala Thr Val Ser Glu Arg Val His Ala Gln Val
Gln Gln Phe Leu Lys Glu Gly Tyr Leu Arg Glu Glu Met Val Leu Asp
Asn Ile Pro Lys Leu Leu Asn Cys Leu Arg Asp Cys Asn Val Ala Ile
Arg Trp Leu Met Leu His Thr Ala Asp Ser Ala Cys Asp Pro Asn Asn
Lys Arg Leu Arg Gln Ile Lys Asp Gln Ile Leu Thr Asp Ser Arg Tyr
Asn Pro Arg Ile Leu Phe Gln Leu Leu Leu Asp Thr Ala Gln Phe Glu
Phe Ile Leu Lys Glu Met Phe Lys Gln Met Leu Ser Glu Lys Gln Thr
Lys Trp Glu His Tyr Lys Lys Glu Gly Ser Glu Arg Met Thr Glu Leu
Ala Asp Val Phe Ser Gly Val Lys Pro Leu Thr Arg Val Glu Lys Asn
Glu Asn Leu Gln Ala Trp Phe Arg Glu Ile Ser Lys Gln Ile Leu Ser
Leu Asn Tyr Asp Asp Ser Thr Ala Ala Gly Arg Lys Thr Val Gln Leu
Ile Gln Ala Leu Glu Glu Val Gln Glu Phe His Gln Leu Glu Ser Asn
Leu Gln Val Cys Gln Phe Leu Ala Asp Thr Arg Lys Phe Leu His Gln
Met Ile Arg Thr Ile Asn Ile Lys Glu Glu Val Leu Ile Thr Met Gln
Ile Val Gly Asp Leu Ser Phe Ala Trp Gln Leu Ile Asp Ser Phe Thr
Ser Ile Met Gln Glu Ser Ile Arg Val Asn Pro Ser Met Val Thr Lys
Leu Arg Ala Thr Phe Leu Lys Leu Ala Ser Ala Leu Asp Leu Pro Leu
Leu Arg Ile Asn Gln Ala Asn Arg Pro Asp Leu Leu Ser Val Ser Gln
Tyr Tyr Ser Gly Glu Leu Val Ser Tyr Val Arg Lys Val Leu Gln Ile
Ile Pro Glu Ser Met Phe Thr Ser Leu Leu Lys Ile Ile Lys Leu Gln
Thr His Asp Ile Ile Glu Val Pro Thr Arg Leu Asp Lys Asp Lys Leu
Arg Asp Tyr Ala Gln Leu Gly Pro Arg Tyr Glu Val Ala Lys Leu Thr
His Ala Ile Ser Ile Phe Thr Glu Gly Ile Leu Met Met Lys Thr Thr
Leu Val Gly Ile Ile Lys Val Asp Pro Lys Gln Leu Leu Glu Asp Gly
Ile Arg Lys Glu Leu Val Lys Arg Val Ala Phe Ala Leu His Arg Gly
Figure 14(B)



Leu Ile Phe Asn Pro Arg Ala Lys Pro Ser Glu Leu Met Pro Lys Leu
Lys Glu Leu Gly Ala Thr Met Asp Gly Phe His Arg Ser Phe Glu Tyr
Ile Gln Asp Tyr Val Asn Ile Tyr Gly Leu Lys Ile Trp Gln Glu Glu
Val Ser Arg Ile Ile Asn Tyr Asn Val Glu Gln Glu Cys Asn Asn Phe
Leu Arg Thr Lys Ile Gln Asp Trp Gln Ser Met Tyr Gln Ser Thr His
Ile Pro Ile Pro Lys Phe Thr Pro Val Asp Glu Ser Val Thr Phe Ile
Gly Arg Leu Cys Arg Glu Ile Leu Arg Ile Thr Asp Pro Lys Met Thr
Cys His Ile Asp Gln Leu Asn Thr Trp Tyr Asp Met Lys Thr His Gln
Glu Val Thr Ser Ser Arg Leu Phe Ser Glu Ile Gln Thr Thr Leu Gly
Thr Phe Gly Leu Asn Gly Leu Asp Arg Leu Leu Cys Phe Met Ile Val
Lys Glu Leu Gln Asn Phe Leu Ser Met Phe Gln Lys Ile Ile Leu Arg
Asp Arg Thr Val Gln Asp Thr Leu Lys Thr Leu Met Asn Ala Val Ser
Pro Leu Lys Ser Ile Val Ala Asn Ser Asn Lys Ile Tyr Phe Ser Ala
Ile Ala Lys Thr Gln Lys Ile Trp Thr Ala Tyr Leu Glu Ala Ile Met
Lys Val Gly Gln Met Gln Ile Leu Arg Gln Gln Ile Ala Asn Glu Leu
Asn Tyr Ser Cys Arg Phe Asp Ser Lys His Leu Ala Ala Ala Leu Glu
Asn Leu Asn Lys Ala Leu Leu Ala Asp Ile Glu Ala His Tyr Gln Asp
Pro Ser Leu Pro Tyr Pro Lys Glu Asp Asn Thr Leu Leu Tyr Glu Ile
Thr Ala Tyr Leu Glu Ala Ala Gly Ile His Asn Pro Leu Asn Lys Ile
Tyr Ile Thr Thr Lys Arg Leu Pro Tyr Phe Pro Ile Val Asn Phe Leu
Phe Leu Ile Ala Gln Leu Pro Lys Leu Gln Tyr Asn Lys Asn Leu Gly
Met Val Cys Arg Lys Pro Thr Asp Pro Val Asp Trp Pro Pro Leu Val
Leu Gly Leu Leu Thr Leu Leu Lys Gln Phe His Ser Arg Tyr Thr Glu
Gln Leu Leu Ala Leu Ile Gly Gln Phe Ile Cys Ser Thr Val Glu Gln
Cys Thr Ser Gln Lys Ile Pro Glu Ile Pro Ala Asp Val Val Gly Ala
Leu Leu Phe Leu Glu Asp Tyr Val Arg Tyr Thr Lys Leu Pro Arg Arg
Val Ala Glu Ala His Val Pro Asn Phe Ile Phe Asp Glu Phe Arg Thr
Val Leu *   Leu Phe Phe Leu Leu Leu Gln Trp Lys Asp Cys Pro *
Ile Phe Pro Pro Ser Gln Met Asn Leu Lys Met Lys Arg Asn Ser Val
Ala His Thr Thr Ala Phe Phe Leu Ser Ile Met Gly Asn Ile Arg Arg
Tyr Glu *   Asp Ile Ser His Gly Ile Ser *   Tyr Asn *   Tyr Cys
Leu Asn His Gly Ile Thr Cys Asn Leu Tyr Gln Ile Lys Ala Glu His
Ile Phe Val Leu Pro Leu Leu Asn Ala Glu Cys Asn Cys Tyr Val *
Ile His Leu Val Leu Cys Ser Lys Glu Leu Phe Val Gln Leu Gln Ile
Phe Ser Lys Ile Val Leu Leu
+ strand (sense)            1st base   sequence (5'-->3')

 1. pch13-sp6-1f              370      TTT ACT TCT AAC GCT TAT TC
 2. pch13-sp6-2f              726      TGA AGG ACT CCT TTG AGA CG
 3. T7.1                     1140      TCA CAA TGG GCT ACT GG
 4. T7.2                     1361      TTC AAC GAG GGA GAT GG
 5. T7.3                     1602      TTA GCA CCA CTG AGA GA
 6. T7.4                     2041      GTT CTT TTA GGC ATT TA
 7. ch13-2480                2486      GCT GCG TCT GTT CGT CAG C

- strand (antisense)        1st base   sequence (5'-->3')

 8. SP6.1                    2746      CCT CTG CTT CAC AAC AT
 9. SP6.2                    2490      GCA GtA GGG CGG ACA CC
                                           (C)
10. SP6.3                    2213      AGG GTC TTC TTC ATT GT
11. SP6.4                    1812      GGA TTG TCT TTG TCT CT
12. pch13-t7-1f              1165      AGT GCA CTT CCA TGG GCG TG
13. pch13-t7-1fa              712      CCT TCA TCA GGT TGA CGA AC
14. pch13-t7-2fa              286      GCG GCA ATC AGA AAC GGA AG
15. CH13-AS-1                 536      TGA ACA CGT GGT ACA T
Figure 16(A)



51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 
701 
751 
801 
851 
901 
951 
1001 
1051 
1101 
1151 
1201 
1251 
1301 
1351 
1401 
1451 
1501 
1551 
1601 
1651 
1701 
1751 
1801 
1851 
1901 
1951 
2001 
2051 
2101 
2151 
2201 
2251 
2301 
2351 
2401 
2451 
2501 



1 AAATGAAGGC 5SS AG^SSS TATAAACTGG 

AGAAGCATCC CTCCTTCCCT XjGCttSSS ^^AATAACGC 
CAGTAGGTGG tSSSS TrGTCC ^ 
GTCGTCGTCC GTTTGCATCA GGAMTCT^r SSSSSF A GCAACAGGA 
GCCTCTAGAC TCCATCTCTC ATAGACAaTt ^-S 0 ™ 00 GTTTCTCATr 
ACCAGTCTCT ^nSS SKSE! ^ACAGAGA 

TATAGGAAAC CACTGATTGC TKncT^Z ^f™^ ""^TACCTTA 
TTTAACAGCA ATTCTGCaS SS ™^ AT TAG GAGAACA 
GAGTGCCGGA CCTCGCACAG aSSS ^^GAACA 

gggcagcagg cgctcctcca gcacS£ S^S 0 ggtcaggggc 

AACAGCGATC GTAATCAATC CTTA^^T ^^TCA AGACTTTTGG 
TClTGGACTT CAA™ SSSS GTC ^C? 
AAGAATGAGC GGTTCGTCAA rrrri^ TGATCGAGGT CTGCTTCCAG 
CAACAAGAGA CCCAACAA^ SSSS^ 5 AGTCCTTrc AGACGTTCAT 
GAAAGTTAAG AGCAGGCAAC aSSJ™ ESS?** 5 ^TCGATT 
ACGTTGGACA AGATCATGAT C^SSJS^ ^53°^ GCTCGAGCGG 
CTITCAAGCA TTTTATAAAA SSS 21^^ GTAAAGATGT 
AAAGTGCCIC AGTCGATGCT GAAAAgS CTTC TTX2GGA 
GAGTGCGGTC CAGCCTTCAC S^aSSS SSSSS* GCTCAAG CAT 

G^GcrrrcG aaggacatca tcgSS SS^ T ^^cat 
agagtgactc aggccctata gaccttaAIJ S aa ? cagcat atgcagaatc 
tactcgccaa catacacgoc SSS^ tgaacatact cacaatcggc 
taaacttcag gaagtattS a^JS^I 0 ca^aaccc cagaaatgat 

GAAAACnCA SSSS aSS? CACAGTGGTC 
™AAGAAG GGAAGAAGGA aSSKSS ^^2^3^ AAAAGCGGAG 
GCTCCTCATG TTCAACGAGG S^S^ AGACACTGGT 

TGGCCACGGG GATAGAGGAT AgSSS SSTI^ ^TAAAAA 
GCCTGTCGCA AAGCACGTCT SS ^CAGAACGCT GCAGTCCCTC 

ggaagatcga gacaagttca ttSIT^ agtcccaaag gaaaggaagt 

TTAGAATAAA GATCAATCAA S^ActS JS?™ 0 CACAAGTTGT 
CTTAGCACCA CTCAGAGAGT gSSSSS 

TCCTATCGTC AGAATAATCA AGAtcS 2SS5S£ AGATTGATCC 
TAGTTTCTGA ATTATATAAT OWY^rTTh*. S^™ 307 CATAATCTTC 
TTGAAAAAGA GAATTGAATC 2SS2££ TTCCAGTAAA GCCTCGAGAT 
CAAAGACAAT CCGAATCaS AGAGACTA T A TCGAGAGAGA 

GGTTCCCCTT cSSaSS SEKSf ATC ™CAGAC 

CTGTGCCATT TCTCGGACTC SS SSS^ 00 * GGAA GCACAC 
GAAGGAAGGG AGGTGGCTCC ^S? 71 ^ 0 ATTGGAAGGC 

AACCTCCAGA TGTATCTTTT tSS^S ^-AAGACTTC 
GCATTTAAAT TGTTTCTGTT aSSSS GTTCTTrTAG 
AAGAAGATCT TACTAAAGAG ^22™ GAGATTGGAC 

CAAAAAGCTC CAAGTTTGGT rSS3 AAA ^ GTCTr GTTCTTCTCT 
TCAAGAAGAC C^SSS SSSl ™^ GAGTGCACAA 
ATCCCTGAAG ACAGCTCGCT CAgS^ SSSS*** '"^AGGT 

GGCcxrrrcAT gggtcaaS tSSS? S£^ G ^aaaacaag 

TCGATGACGC ACCCTAGCCA CTGGCCCCTT ^S?^ AGCTGGCGAA 
AAGTTGTAAA CTTTGGTGGC SSSSS ^ATTTCCAA 
GCTCCCTCAG GTGCCAAGGC SS^^ ^^GTCA 

CAGCTGAGTT CCTTGTGAAT CTCtSS ^3^° ^CTCTTCGT 

c lx - rGTTTTA GGGGTTGGGG CTAGTGTCTT 
2551

2601 

2651 

2701 

2751 

2801 

2851 

2901 

2951 

3001 

3051 

3101 
3151 
3201 
3251 
3301 



ss sss »««»» ss ssss 

GAGTCTCTIT TTCAAACATC CGGctS? ATrTTCTGCT AAATTaSU 
CCATATTAAA ATCCTCACTr JCT^^ ATAT ^CACC TOnxxSrJ 



Figure 17



1 FPEPFLPV-E AEGGMSALPF SVRWCIKTFL INWK.RLGKM AKISNPWMWa 
£ WACUXSVGG F-KGLPSASL ATGftA/VRLHE eSfSt^ 

1 PLDCICHRQM PPSFTENQSL L-TLLLTLIL FTLYRKPLIA CVEKoT^ 

U SiJ5S5££ HLLDENRVPD LAQMYQLFSR VRGGqS2£ S^f2 

} ?£f)? NPEKI) KDMVQDLLDF KDKVDHVIEV CFQ^ERFVN UKESfS? 

f IAKHVDSKLR AGNKEATDEE U^TLDKBtE uSSSS 

)1 FEAFYKKDLA KRLLVGKSAS VBAEKSMLSK LKHECGAAFT SSSSS 

.1 ELSKDDJVHF KQHMQNQSDS GPIDLTVNTL SSS 

f ^^RKLQ WQTTLGHAVL KAEnSS SS 

} ^f^P SFEE IKMATG IEDSELRRTL QSLACGKARV LIKSPKG^V 

1 EFKHKLFRIK INQIQMKETV EEQVSTTERV FoSoYoS 

} ^*f VPL kglvlv sksc kfglfIcvim S£KK£ ESE&SS 

I ^^f? 00 HLE * K Q GP ™ geh.kepgfk AGEWMTHP^ S£SrS 

1 SCKLWWLIFR KSGF-VSSLR CQGHGVRPAA SVRQLSSL.T SVLT^rf^ 

1 yFPF.D.VWQ SLFFCIGVTA L-FFLIAVFV «LQ« «SLVWF K^SfS 

1 ^SSS ^!: CCEAE V ^WKD.KDF VGWwS SS 

i 1 EEZS 



1101 TLIKLCDMQM TO 



201 TATVTTJPPirn irr™*-™ r MYQLFSR VRGGQQALLQ HWSEYIKTFG 

251 ^YHEf^ KCMVQDLLDF KDKVDHVIEV CFQKNERFVN U^FCTFT 

sees ssss oss jsssss SEE 

50 ZS SSSSS SSSP ° s "^ 

51 SS«=SKSSESS 
+ strand (sense)            1st base   sequence (5'-->3')

 1. pch14-sp6-1f              686      GGC TOA ACA [illegible]
 2. pch14-sp6-2f             1005      CTA TGA AAA GAC AGC TTA AG
 3. pch14-sp6-3f             1315      ATT TAG [illegible]
 4. pch14-sp6-4f             1589      CAG ACT TTA AAG TCA CAA G
 5. pch14-sp6-5f             1908      CAA AGA CTT GGT GTA TAG TG

- strand (antisense)        1st base   sequence (5'-->3')

 6. pch14-sp6-6fb            2020      GCA GTT TAA TTT GGT CCT G
 7. pch14-sp6-5fb            1757      CTG TAA TTA TAG TTC TGT C
 8. pch14-sp6-4fb            1607      CTT GTG ACT TTA AAG TCT G
 9. pch14-sp6-3fb            1339      ATA ATC ATG CTT TTC AAA C
10. pch14-sp6-2rb            1023      TTA AGC TGT CTT TTC ATA G
11. pch14-sp6-1rb             704      GTA CAT TGA GTG TTA AAC C
12. CH14a                     629      CGG CAG AGC TGA CTA CTG GAA GG
13. CH14b                     644      CAA GCA GGG AAG TAA CGG CAG
14. CH14c                     109      CTT GTT AGC TTG TTT AGA AGC
                                       TGG AAG AG AGG
15.                            90      GGJ GCA AGA GAA GGT CTC CTT
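
Several of the sense and antisense primers in the table above appear to be exact reverse complements of one another, for example primer 4 (pch14-sp6-4f, 1st base 1589) and primer 8 (pch14-sp6-4fb, 1st base 1607). The following is a minimal sketch, not part of the original disclosure, of how such a pairing can be checked. The helper name `reverse_complement` is hypothetical, and the sequences are copied from the table as printed.

```python
# Illustrative check only; `reverse_complement` is a hypothetical helper.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence, ignoring spaces."""
    compact = seq.replace(" ", "").upper()
    return compact.translate(_COMPLEMENT)[::-1]


# Primer sequences as printed in the table (5'-->3').
sense_4f = "CAG ACT TTA AAG TCA CAA G"       # pch14-sp6-4f, 1st base 1589
antisense_4fb = "CTT GTG ACT TTA AAG TCT G"  # pch14-sp6-4fb, 1st base 1607

is_pair = reverse_complement(sense_4f) == antisense_4fb.replace(" ", "")
print("4f / 4fb are reverse complements:", is_pair)

# For a 19-mer, the antisense primer's first base lies 18 positions
# downstream of the sense primer's first base on the + strand.
print("offset:", 1607 - 1589)
```

Under this reading the two primers cover the same 19 bases of the cDNA, one on each strand.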
Figure 19



51 
101 
151 
201 
251 
301 
351 
401 
451 
501 
551 
601 
651 



1 "TCTGATTTTG ^^^G 

ACAAAAACAA CTAACTACTC TACAGITV^a ST^ 001 ^ AGAATCCGTA 
TCCTCCCAGA ACTCGAACTT ScaSaaS SS^P** ^^CACT 
AGGGACAAAG TAGGACCCCC AGAATAAGTC cS^T^ ^TCGTXX 
ACAAAAGGAG ATTCTGTAGA AAAAAATr^ 2^°*™* AGAAGAGGAA 
TCTGGCACAG AAACCAGAAA AActJS^ ^^ ATGA GTCAACTCAG 
CTTGTAAAAA TGGGGATGAG TgSSaS TACTGGCCTG 
AAAGCCTTCC CCAATTCTAA ATrrcSS aT*^!^ CTCACCCTCC 
AAATTGTAAA TATGATGCAA AgS^ ^T^™ 71, ^^TTCACCC 

atgtgagtag aagaatttca SS ^ GATTCT cccttcactc 
GCAccAccrr ccagtagtca gctSSS? S^^ 07 ^caccacca- 
gatggaatgt cccttctatc atoSSa^ I^r 1 ^ 0 ^ cttctaagaa 

701 GTACAAGTCC GGACTCCACA Sc^S TT? TAGGTTT AACACTCAAT 

"1 CGACATGCCT TGAAATGGAT Srl^T? CCACCA TTAA TGTCCCACCA 

801 CCTCCCIGGC SSSSSJ SSSJSf iS*£°?* T A <^SaS 

AGATACTCTA CAGAACTTGT T ACTCATCAA 
95! JCATAATATG AAGTTTTATr GCcJSSS JJSSS* TATATTGCTT 

, X^ 1 AGTTTGTAAG TTTATTAir-T r£™™7 <-TGAAGTGTC TAATTTTTPA 

1001 TTTACTATGA aUSaSgS SSa^ 

1051 TGGGGCATGT TTGTGCACTC SSS^r ^T^ 1 ^ ^AAAATATT 

1101 ATCATGGTTA GTCATGGTAC Tr£T£^ G ^^A^TA TGAAATTGAC 

J"l TGAGTX3GAGA SScS iSS? 0 ™ 0 * 

1201 ACTTTCACTT TTCCCAAAGA SSaa^ ^1^°™ AAATTCTACT 

HI] ^GCATrGGC CAAAGGTACT GaSJS aIS^ C ACCA ™AAA 

1301 TTAGTTrTTA AGTGAATTTA GTTTrT^Z AAAATA TTCA ATTCTGCTTT 

1351 GAGGCTCAGT GCTACTTTCG r^fT^* 0 ^^TTATA CAGGCCTCTC 

1J01 CAGGATGAAT GAGGLES SS^ES JES" 0 " 0 CC ™S 
1501 ^CAGTTC SSSSS ^raCAAGT 

1501 ATGCCTTCTA AATAATTTTT TTCGGAaS I^*™ 0 TAATAACATA 

1551. AAATTTTTTT ACAAGTATTT Ar^^ ACATT ATCAC AAAATTATAC 

1601 TCACAAGATT ATAAA^SJ SS ^^AAAACA GAC^Sa^G 

J"l rrCTCAGAAT CCACAGAAM iSSS T^ACATTCTG AAAAATAACA 

1701 GAAATGTAAA AATTAGATTT AAaSS ^ ACTCAA GATAATTITr 

1801 ATTACAGAGA TCAGATCAGA SSKSS SSE™ ^^CTATA 

1801 crrrrGGccT actotattac ^acZ^Z £Ef^ TAGA taggatcaaa 
o 5 gcaagaSS SSS^ tcgt ™ 

1301 CTGATTTCAA AGACTTGGTG TATAgSttT T^T? 01 ™** lAACAGATCA 

1951 GGTTAGAAAA GTGGATTAAT SIS^- ^^TTAAXG CTTAAAAGGT 

2001 TCAGGACCAA AWAAaS5J ^AATAAAGAC TX^CAACaSJ 
si SSSSS SSKSK S52S KAIS ^ E -

101 TKGDSVEKNQ; SSS ySSJe SSS 1 ™ 

151 KAFPNCKFAE KCLFVHPNCK YDAKCTKPnr dE™^ CAYHHPISPC 

201 APPSSSQLCR YFPACkS £5S£S£ JST*" 

251 RHALKWIRPQ TSE-HPVLPG RRSC^/£m ^STf^ ^"IM/PP 

301 S'YEVLLPIY LKCLIFOVCK nS^ Y**KILYRTC QIFETWNILL 

351 WGMFVHCCCE ££££2 gS£^ A *GRAKFC«NI 

401 TFTFPKDYIH FIIHHENSIG OR^GcfS2 ^7^°' GSCHYSKNCT 

451 EAECYFR.SS SFPAFCDrS E^iSgS SS'S 

SOI MPSK-FFWET TLSQNYTNFF TsFyrr vT^ S^ #B0F ^LSFTNNI 

S51 FSESTENILS YY^FUrac m t ^ DFKVTRL'MY ICILTF-KIT 

601 LLAYCITYRV fLtSvSS A^SS Q ^ RDQIR -VNCKIDRMK 

651 G.KSGLMQKG ..RLQHSQD^ Jg^^^ * Q ITDFKDLV YSVKN- SLKG 



EDDDYGSRTC SISSSVSVPA KPERRPSLPP 
TKTTNYSTVP QKQTLPVAPR TRTSQEELLA 
TKGDSVEKNQ AEMSELSVAQ KPEKLLERCK 
KAFPNCKFAE KCLFVHPN CK YDAKCTFCPnF " 
?!!fff?^ YFPACKKMEC nwnn^rr 
RHALKWIRPQ TSE " 1 



SKQANKNLIL 
EWQGQSRTP 
.YWPACKNGDE 



PFTHVSRRIP 
" 'SPDCT 



KAISEAQESV 
RISPPIKEEE 
CAYH HPISPC. 



VLSPKPVAPP 
FYHPTINVPP 
Figure 21



  1 AAAACTTTCG GAAGAGAAAG TTGCCTGTGG TAAGTTCAGT TGTTAAAGTA
 51 AAAAAATTCA ATCATGATGG AGAAGAGGAG GAAGAAGATG ATGATTACGG
101 GTCTCGAACA GGAAGCATCT CCAGCAGTGT GTCTGTGCCT GCAAAGCCTC
151 AAAGGAGACC TTCTCTTCCA CCTTCTAAAC AAGCTAACAA GAATCTGATT
201 TTGAAGGCTA TATCTGAAGC TCAAGAATCC GTAACAAAAA CAACTAACTA
251 CTCTACAGTT CCACAGAAAC AGACACTTCC AGTTGCTCCC AGAACTCGAA
301 CTTCTCAAGA AGAATTGCTA GCAGAAGTGG TCCAGGGGAC AAAGTAGGAC
351 CCCCAGAATA AGTCCCCCCA TTAAAGAAGA GGAAACAAAA GGAGATTCTG
401 TAGAAAAAAA TCAAGATTAC TATGACATGG AATCCATGGT CCATGCAGAC
451 ACAAGATCAT TTATTCTGAA GAAGCCAAAG CTCTCTGAGG AAGTANTAGT
501 GGCACCAAAC CAAGANTCGG GGATGAAGAC TGCAGATTCC CTTCGGGTTC
551 TTTCAGGGAC CCTTATGCAG ACACNAGATC TTGTTCAACC AGATAAACCT
601 GCAAGTCCCA AG



  1 KTFGRESCLW *VQLLK*KNS IMMEKRRKKM KITGLEQEAS PAVCLCLQSL
 51 KGDLLFHLLN KLTRI*F*RL YLKLKNP*QK QLTTLQFHRN RHFQLLPELE
101 LLKKNC*QKW SRGQSRTPRI SPPIKEEETK GDSVEKNQDY YEMESMVHAD
151 TRSFILKKPK LSEEVXVAPN QXSGMKTADS LRVLSGTLMQ TXDLVQPDKP
201 ASPK



1 NAGCTGCTCT GACGGGNAGN GGAATGNATG GNGGCTTGTT CNGAAACNNG 

51 CCAGATGGCG NGAGGGGGAC AAGTAGCGGC GTGATTOAGA AGAGGGAGGT 

101 GAGGGTNCTC ACATCACCMC ATCTNACCAT GNCGNGCCNT CCCCAotSJ 

151 AANANTGATC ATAGNGGGAA GTGGGCCCAC CCAGAAGCNT GATTGAGCGG 

201 CCGCCAGTAN GAAACNNGTT TGTCCANTTA GNCATACNNA T^TaSS 

251 CNAGCNGCGT CCCCGGCACC NGCANANNNN CNNCNGGGAC NACNGCCCNN 

301 NNOTNNGTTA NNCNGNGNAG NNAAAAAATT CAATCATGAT GGAGAAGAGG 

]l\ AGGAAGAAGA TGATGATTAC GGGTTTCGAA CAGGAAGCA T CTCCAGCAGT 

401 GTCTCTGTGC CTCCAAA ' " ^^f^MST 



Untitled translated in RF 2 



1 SCSDGXXNXW XLVXKXARWX EGDK « RRDXE EGGEGXHITX SXHXXXSPXX 

im Q KXD ' AAASX KXVCPXXHXX XRVXXASPAX aSSS 

101 XXLXXXXKKF NHDGEEEEED DDYGSRTGSI SSSVSVPA 
Figure 22



CH1 -9a 11-2 



CAA ATG GAA GAA ATG CAA AAG GCT TTC AAT AAA ACA ATC GTG
AAA CTT CAG AAT ACT TCA AGA ATA GCA GAG GAG CAG GAT CAG CGG CAA
ACT GAA GCC ATC CAG TTG CTA CAG GCA CAG CTG ACC AAC ATG

Lys Gln Met Glu Glu Met Gln Lys Ala Phe Asn Lys Thr Ile Val Lys

Leu Gln Asn Thr Ser Arg Ile Ala Glu Glu Gln Asp Gln Arg Gln Thr

Glu Ala Ile Gln Leu Leu Gln Ala Gln Leu Thr

Val Gln Thr Asn Met Thr Gln Leu



CH8-2a13-1

GAA CAG GCA AGC AGA TAT GCT ACT GTC AGT GAA AGA GTG CAT GCT CAA
GTG CAG CAA TTT CTA AAA GAA GGT TAT TTA AGG GAG GAG ATG GTT CTG
GAC AAT ATC CCA AAG CTT CTG AAC TGC CTG AGA GAC TGC AAT GTT GCC
ATC CGA TGG CTG ATG CTT C

Glu Gln Ala Ser Arg Tyr Ala Thr Val Ser Glu Arg Val His Ala Gln
Val Gln Gln Phe Leu Lys Glu Gly Tyr Leu Arg Glu Glu Met Val Leu
Asp Asn Ile Pro Lys Leu Leu Asn Cys Leu Arg Asp Cys Asn Val Ala
Ile Arg Trp Leu Met Leu
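
The amino acid lines of the CH8-2a13-1 fragment follow from its codons under the standard genetic code. The sketch below is an illustration only, not part of the original disclosure, and assumes the standard code applies. The names `CODON_TABLE`, `THREE_LETTER` and `translate` are hypothetical, and the codons are taken from the fragment as printed.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter amino acid codes; "*" marks a stop codon.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "*",
}


def translate(seq: str) -> list[str]:
    """Translate a DNA sequence in frame 1; a trailing partial codon is ignored."""
    compact = seq.replace(" ", "").upper()
    return [
        THREE_LETTER[CODON_TABLE[compact[i:i + 3]]]
        for i in range(0, len(compact) - 2, 3)
    ]


# First eight codons and the last (incomplete) line of the CH8-2a13-1 fragment.
first_codons = "GAA CAG GCA AGC AGA TAT GCT ACT"
last_line = "ATC CGA TGG CTG ATG CTT C"

print(" ".join(translate(first_codons)))
print(" ".join(translate(last_line)))
```

The final unpaired `C` of the fragment is dropped because it does not complete a codon.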



CHl3-2a12-1 



CH14-2a16-1 

TG TTT GTT CAC CCA AAT TGT AAA TAT GAT rr/v ^™ 

= si s is s s r - ~ ™ - - - 

Phe Val His Pro Asn Cys Lys Tvr Asn ai^ r „~ ^ 

Cya Pro Phe Thr „ is / al s J r ^ j£ * * j£ ^ L ys Pro ^ 

Pro Val Ala Pro Pro 9 ° V&1 LCU Ser Pro ^ys 
Figure 23(A)



rrr^T 00 GCTGCCAGGA CGCGAGCCAC TGAGGAGCCG CTCAGCCAGC 
GCCATAGCCC TTAGGACTAT CGGTCACATT CTCGCGCTPr 

Sc T c c c££r! S CCTCGGCA GTCGro3CTC SSSS SESSS 

=1 SEES SEES EE i S 

AAGAATACAG AGTCAAAAAA GTTAAGTCCA CCGCTCr^n ^ 
TACAGTTGAT TTGCATGAAG AGTCTTCCAA JS? 

ssssss sssss sssss SE ~ 
sssss ssss: ssss = ™ 
ssss sssss ~ EES = - 

GTGATCCAAA ATCTGCATTC amyvSIT AGCA AGTGAA CAGGGCGGTG 

tctgattata ESSSS SESE r TGAGAGC 

CAAAGATCCA GAAGATATAC CAACATTTGA 

TGGAAGTAGA AAAAGAAAAA AGTCACTCGA TGCATrrsTr ^AGAAAGTTA 

, rrir ;„ f TCTTATAGAA AATATGGATC TTTACATGTT GAATCCTTCT 
AG CACT AAAA TTTGGTTTGT TATTGAACTT TrT^inn, ^AlCCTTGC 
ACAGCTTGAT ATTGCAAATT AlJSJSJ TTCTTCTACT CC^^ 
TTCTGGTTTC T ATCAPTP a r* a p^»J i^TTCTACT CCTAAAGATT 

=ot™ c i™ a a sssss ££££ ST 

SEES" TATCCAAMT *«™««r SagS! tcaSS™ 

CTTTTGTCCA TTAAGCGTTA TAAGGGTATT TCGCaSUc 
ATGGTGGAAG AATATGAAGA AATTGCTGAT TCCCAGTaTn 

GAGAGGATAA ATCCTCAAAA AATCTTCTTG GTTCTGCTAP IaI™ 
CTAAATATGG TGAATATTGC TG CTAAT ATT C^rr*Z» » AAATGCCATT 

ssss see ss£ SEi =? 
ssss ess see « i= 

AGGAGGCAAG TGCATGTAGA GTGACCCTTC SSSf 

gatgaatcat caggctggtt tgagtcagag aScSa?a1 ^Sg?" 
ACTGACCACA ATTTGTTGTA TTTCTAGTTT TT^n»»-,^
GGTGTTCAGT TAGAGTTGCT CTTTATCGTT Jrnr-T ° ATATATAA AT 
AGTAAAGGAA AAGATTATCT TGTGTTAGCT 

TGCGGAATCA GTAGATGTTT CAGTATTGCA ACCTCTGAGT G^^ 0 
AAAATACGAA TATAGAAAGG GAAGCTGAAA CTG^gSS rrr^ ° 
AGTAGTAGTA TG C ACCAGG A TGACTTGGTG A^ScISS XSSf™ 
TGAACTTGAA CCAAGCCATT CTCAAACTCT TO^SS TAGATGCAGT 
ATATTACCCC AGAAATCAAT CCCTTGcJS aS^I CTTCTTTTAG 
GTTGAATATG AGGCAGGACA TaSSaSa SSZ?*™ ATCTGA ^T 
TTCTGTTGAG ATCGATAATG A^S 

CTATAGAGAA ACCATCTATT APPTAT™ AAAGTCTGAG AG CTTTAGTT 

ATGGATAATA ttSSES £££££ £££££ 2?™°™ 
GCTGTCTGAA ACAATAGTrr rar^^T. TCCATGCAAA TTTTCACAAA 
ATGAAGATGG SS^I™ J****** GTACCCGACA 

TTGATTTCTG TTGTGGATTC CTGACACAGC AAAGCAAACT 

ACAGTCTCCA GAAgSgSc mlZllr 2***™ AAGAAGAAGA 
ATTTTTATGC TGAATTGCAA aItoacap G ™ CAGAGG ^CAGCTACAG 
AATCTTGTAC ATGGATCAAA i£££S SEES" 

TAATCGTATT AAAG CCTTAG AAG TTAACAT GTctSJIS 
TGGAGGAGCT TAGCCAAAGG TArrr-aT^ GTCTCTCAGT GGTCGCTATC 
GCTTTCAACA AAACAATCGT ^aaS^ AAATGGAAG A AATGCAAAAG 
GGAGCAGGAT AATACT «»A GAATAGCAGA 

TGACCAACAT GACACAGCTT £SS££ SSSS """^ 
TTGAAACGGG AGGTTTCAGA T CG AC AAAG C SJf^ 
TCTTTGTGTT GTCTTGGTAr tpa^Z. 1ATCTTGTCA TATCTTTGGT 

CTTCTCAATT £S££E TATATTTCAA SST " 

TATCCAAGCC CTAAAAGGTG TTTCTCTTCC GAG 
AAGAAGAACT TCATTCCCAC TCATPiPiTP TATGATGATA TGAATTTGAA 
GCAAAGAAGT AGAC^ SS^S S"™^ 
TCTCCAGAAA AGAAGAAGAA GCGCTGCAAG TACAAAAtt^ CCTCAAGTTT 
GACCATAAAG CCTGAAGAAC CATTrr^ TACAAAATTG AAAAAATTGA 
AAGGAAGAAA GCCC^S AACCAGACAr £ TAGCCAAT GGCGACATAA 
GTTTATCACT CTTCTtSE ££££££ £££££ 
TTCATCACAG TCAGAAGAGT CCTATTlSS IttC^^ GCTCAGAAAC 
GTCTGTGCAA TGGACAGTCT r**T*lJl° TGGCATTTCA GCTTGCACAA 
AAACGAAGAC GATCTAAAGT C^lrr^ AAACTCMAA GAGGGCTTTA 
AATACAGACT AAG^ S£££ ££££ IST"" 
GAAACAAAGA GATCACCGTG GGAACATTTG £££££ ££££ 



WO 97/38085 



PCT/US97/05930 



CATATCTAAA ATTAATTGAA CTTTTCATAC AGAAGACTTT TTTGTTGTTG 
TTCTTTGAAG AACAGTCTGT AGTATTTGAA GGGTTTGGGG GAGGGAgIII 
ATATTAATGG G AAAGG CATT CAGAAATTAT GGTTTCTACC TTTTTAAAAA 
GTAGATGGGA TTGTGCTCAA TCTTGGTTAA TGAGCTACAG TTTTAC^ 
CCTATAAGGA CAATGGTAGA CATTTTATAA AGATGTTTTT 
SJSS AATTACTGGG ACAAAAGTAA TTTGGAAGCC CAGTTCCTtI 

GGTGGGATAG GAATGAAAGC CTAAACCTCT TCCTTTAGCT TTGTTCCTAT 
TTCTTGCACC TTCCCATATT TATGTGCCTT TTGTCTATTT I^ScCAC 
TGGAAGAGGA GGGATAACTT TTTCTGTTAT TTGATTTCTT SiSSSJ? 
GTTAGGTTTT TGAAGCTGCA AACACTACAA TGCTTTGAGG G^TctSS 
CTGAAGCTCA GGAGTGTGGA TCAGACAGTC TAAAGATCCT 
CAACTGGATC TTTGTTTAG C AAACTCACTG GAAATGAACA CTTaISgaa 
TTTTTAAGTC TGTTCTGTTA GGTAGATGGT GATGCTCTTG ££££££ 
TTATTCAGGC TGGATTACTT CTTACTTAGT TACTAACTCA ATGAgIISa 
AATCCCTACA GGATCTTTTT TTGCAAACAA CTGATATATG « 
TTTGACAAAT TCACCTTTTA AACACGACGT TAACCGATTT GTGAAGGTTT 
TCTTTAGCTT ACATTTTAAA CATACACAAT AAACACTAAT ££££££ 
TTCACTGTTT TTATTAGTAT GAATATAAAA TTTGAAGGTT SgCCA^S 
GTACAAGTCT CATGATATAA TCACAGCCTG CATACATATG CACA^CA 

GTTAGTGAGT TTGTCAAGCT TAATCTAATT GGTTAAGTCT 

TTATTCCTTG ATGTTTGCTT TGTATTGGCT ACAAATGTGC AGAGGTAATA 
CATATGTGAT GTCGATGTCT CTGTCTTTTT TTTTGTCTTT ^TM 
TGGCAGCAAC TGTATTTGAA TAAAATGATT TCTTAGTATG JJJSJJSSJ 
AATGAATGAA AGTGGAACAT GTTTCTTTTT GAAAGGGAGA gZ^gIcZ 
TTTATTGTTG TGATGTTTAA GTTATAACTT ATTGAGCACT ^SaSg 
ATAACTGTTT TTAAACTTGC CTAATACCTT TCTTGGGTAT TGTTTGTAAT 
GTGACTTATT TAACGCCTTC TTTGTTTGTT taagttgctg ct^Igg^I 
ACAGCGTGTT TTAGAAGATT TAAATTTCTT TCCTGTCTGC AcH^gS 
A.TTCAG AG C A AGAGGGCCTG ATTTTATAGA AGCCCCTTGA A^^CC 
AGATGAGAGC AGAGATACAG TGAGAAATTA TGTGATCTGT G^SSSS 
AAGAGAATTT TCAATATGTA ACTACGGAGC TGTAGTGCCA SagISctg 
TGAATTTCCA AATAAATCTG AACACTTGTC TTTATT TTAGAAACTG 
Figure 24 



QRGLPGREPL RSRSASAIAL RTIGHILALL LRLLHLGLGS GGCREDVPPS 
GRGKKEEKMK KHRRALALVS CLFLCSLVWL PSWRVCCKES SSASASSYYS 
QDDNCALENE DVQFQKKNTE SKKL.SPPWE TLPTVDLHEE SSNAWDSET 
VENISSSSTS EITPISKLDE IEKSGTIPIA KPSETEQSET DCDVGEALDA 
SAPIEQPSFV SPPDSLVGQH IENVSSSHGK GKITKSEFES KVSASEQGGG 
DPKSALNASD NLKNESSDYT KPGDIDPTSV ASPKDPEDIP TFDEWKKKVM 
EVEKEKSQSM HASSNGGSHA TKKVQKNRNN YASVECGAKI LAANPEAKST 
SAILIENMDL YML.NPCSTKI WFVIELCEPI QVKQLDIANY ELFSSTPKDF 
LVSISDRYPT NfCWIKLGTFH GRDERNVQSF PLDEQMYAKY VKVELLSHFG 
SEHFCPLSLI RVFGTNMVEE YEEIADSQYH SERQELFDED YDYPLDYNTG 
EDKSSKNLLG SATNAILNMV NIAANILGAK TEDLTEGNKS ISENATATAA 
PKMPESTPVS TPVPSPEYVT TEVHTHDMEP STPDTPKESP IVQLVQEEEE 
EASPSTVTLL GSGEQEDESS PWFESETQIF CSELTTICCI SSFSEYIYKW 
CSVRVALYRQ RSRTALSKGK DYLVLAQPPL LLPAESVDVS VLQPLSGELE 
NTNIEREAET WLGDLSSSM HQDDLVNHTV DAVELEPSHS QTLSQSLLLD 
ITPEINPLPK IEVSESVEYE AGHIPSPVIP QESSVEIDNE TEQKSESFSS 
IEKPSITYET NKVNELMDNI IKEDMNSMQI FTKLSETIVP PINTATVPDN 
EDGEAKMNIA DTAKQTLISV VDSSSLPEVK EEEQSPEDAL LRGLQRTATD 
FYAELQNSTD LGYANGNL.VH GSNQKESVFM RLNNRIKALE VNMSLSGRYL 
EELSQRYRKQ MEEMQKAFNK TIVKLQNTSR IAEEQDQRQT EAIQLLQAQL 
TNMTQLVSNL SATVAELKRE VSDRQSYLVI SLVLCWLGL MLCMQRCRNT 
SQFDGDYISK LPKSNQYPSP KRCFSSYDDM NLKRRTSFPL MRSKSLQLTG 
KEVDPNDLYI VEPLKFSPEK KKKRCKYKIE KIETIKPEEP LHPIANGDIK 
GRKPFTNQRD FSNMGEVYHS SYKGPPSEGS SETSSQSEES YFCGISACTS 
LCNGQSQKTK TEKRALKRRR SKVQDQGKLI KTLIQTKSGS LPSLHDIIKG 
NKEITVGTFG VTAVSGHLN «LNFSYRRLF CCCSLKNSL. YLKGLGEGEN 
INGKGIQKLW FLPF'KVDGI VLNLG* *ATV LQS»SLPIRT MVDIL'RCFF 
TRLITGTKVI WKPSSLGGIG MKA'TSSFSF VPISCTFPYL CAFCLFIMPL 
EEECLFLLF DFFYNFVRFL KLQTLQCFEG VCA^SSGVWI RQSKDPKNLP 
TGSLFSKLTG NEHLMEFLSL FC'VDGDALV IFTYSGWITS YLVTNSMRKK 
SLQDLFLQTT DICRQIFDKF TF«TRR«PIC EGFL«LTF»T YTINTNPPNF 
HCFY.YEYKI .RFGQLVQVS • YNHSLHTYA QIQLVSLSSL LLVKSKEII 
IP'CLLCIGY KCAEVIHM»C RCLCLFFCL* KIIGSNCLI K«FLSMIVQ» 
•MKVEHVSF* KGEN'PFI W MFKL«LIEHF ••••LFLNLP NTFLGYCL-C 
DLFNAFFVCL SCCFRLTACF RRFKFLSCLH N«LFRARGPD FIEAP«KEVQ 
MRAEIQ»EIM •SVCCGKRIF NM«LRSCSAI RNCEFPNKSE HLSL 
Figure 25(A) 



TAGAATTCAG CGGCCGCTGA ATTCTAGCTG CGGGGTAGGA GTCCGCGGCA 
GCCTCCGGGT AAGCCAAGCG CCGCGCAGTG CTGAGTTCCC GCACGCCGCA 
GAGCCATGGA GATCGGCACC GAGACCAGCC GCAAGATCCG GAGTGCCATT 
AAGGGGAAAT TACAAGAATT AGGAGCTTAT GTTGATGAAG AACTTCCTGA 
TTACATTATG GTGATGGTGG CCAACAAGAA AAGTCAGGAC CAAATGACAG 
AGGATCTGTC CCTGTTTCTA GGGAACAACA CAATTCGATT CACCGTATGG 
CTTCATGGTG TATTAGATAA ACTTCGCTCT GTTACAACTG AACCCTCTAG 
TCTGAAGTCT TCTGATACCA ACATCTTTGA TAGTAACGTG CCTTCAAACA 
AGAACAATTT CAGTCGGGGA GATGAGAGGA GGCATGAAGC TGCAGTGCCA 
CCACTTGCCA TTCCTAGCGC GAGACCTGAA AAAAGAGATT CCAGAGTTTC 
TACAAGTTCG CAGGAGTCAA AAACCACAAA TGTCAGACAG ACTTACGATG 
ATGGAGCTGC AACCCGACTA ATGTCAACAG TGAAACCTTT GAGGGAGCCA 
GCACCCTCTG AAGATGTGAT TGATATTAAG CCAGAACCAG ATGATCTCAT 
TGACGAAGAC CTCAACTTTG TGCAGGAGAA TCCCTTATCT CAGAAAGAAC 
CTACAGTGAC ACTTACATAT GGTTCTTCTC GCCCTTCTAT TGAAATTTAT 
CGACCACCTG CAAGTAGAAA TGCAGATAGT GGTGTTCATT TAAACAGGTT 
GCAATTTCAA CAGCAGCAGA ATAGTATTCA TGCTGCCAAG CAGCTTGATA 
TGCAGAGTAG TTGGGTATAT GAAACAGGAC GTTTGTGTGA ACCAGAGGTG 
CTTAACAGCT TAGAAGAAAC GTATAGTCCG TTCTTTAGAA ACAACTCGGA 
GAAAATGAGT ATGGAGGATG AAAACTTTCG GAAGAGAAAG TTGCCTGTGG 
TAAGTTCAGT TGTTAAAGTA AAAAAATTCA ATCATGATGG AGAAGAGGAG 
GAAGGAGATG ATGATTACGG GTCTCGAACA GGAAGCATCT CCAGCAGTGT 
GTCTGTGCCT GCAAAGCCTG AAAGGAGACC TTCTCTTCCA CCTTCTAAAC 
AAGCTAACAA GAATCTGATT TTGAAGGCTA TATCTGAAGC TCAAGAATCC 
GTAACAAAAA CAACTAACTA CTCTACAGTT CCACAGAAAC AGACACTTCC 
AGTTGCTCCC AGAACTCGAA CTTCTCAAGA AGAATTGCTA GCAGAAGTGG 
TCCAGGGACA AAGTAGGACC CCCAGAATAA GTCCCCCCAT TAAAGAAGAG 
GAAACAAAAG GAGATTCTGT AGAAAAAAAT CAAGCTGAGA TGAGTGAACT 
GAGTGTGGCA CAGAAACCAG AAAAACTTTT GGAGCGCTGC AAGTACTGGC 
CTGCTTGTAA AAATGGGGAT GAGTGTGCCT ACCATCACCC CATCTCACCC 
TGCAAAGCCT TCCCCAATTG TAAATTTGCT GAAAAATGTT TGTTTGTTCA 
CCCAAATTGT AAATATGATG CAAAGTGTAC TAAACCAGAT TGTCCCTTCA 
CTCATGTGAG TAGAAGAATT CCAGTACTGT CTCCAAAACC AGTTGCACCA 
CCAGCACCAC CTTCCAGTAG TCAGCTCTGC CGTTACTTCC CTGCTTGTAA 
GAAGATGGAA TGTCCCTTCT ATCATCCAAA ACATTGTAGG TTTAACACTC 
AATGTACAAG TCCGGACTGC ACATTCTACC ATCCCACCAT TAATGTCCCA 
CCACGACATG CCTTGAAATG GATTCGACCT CAAACCAGCG AATAGCACCC 
AGTCCTGCCT GGCAGAAGAT CATGCAGTTT GGAAGTTTTC ATGTACTGAT 
Figure 25(B) 



GAAAGATACT CTACAGAACT TGTCAAATCT TTGAAACTTG GAATATATTG 
CTTTCATAAT ATGAAGTTTT ATTGCCTATC TATCTGAAGT GTCTAATTTT 
TCAAGTTTGT AAGTTTATTA TGTGGTTTTA ACATTGGGTG TTTTTGTTTT 
GTTTTTACTA TGAAAAGACA GCTTAAGGAA GAGCTAAATT CTGTTAAAAT 
ATTTGGGGCA TGTTTGTGCA CTGCTGTTGT GAGGATCAGC ATATGAAATT 
GACATCATGG TTAGTCATGG TACTGCAGCT TAGGGGGCTA CACGGTTGCT 
GTGTGAGTGG AGAGATGCAG TGAGGCAGTT GTCATTATTC TAAAAATTGT 
ACTACTTTCA CTTTTCCCAA AGATTATATA ATGTTCATAA TCCACCATGA 
AAACAGCATT GGCCAAAGGT ACTGAGGCTG CTTAAAATAT TCAATTCTGC 
TTTTTAATTT TTAAGTGAAT TTAGTTTGAA AAGCATGATT ATACAGGCCT 
CTCAGGCTGA GTGCTACTTT CGGTAAAGTT CCAGTTTTCC TGCCTTCTGT 
GACAGGATGA ATGAGGTGGG TATGGACAGT GGAGGCAGCT GGAATGGCAA 
GTGCAGAAAA TAGGAACAGT TCTATACAGT GCTCTCATTT ACTAATAACA 
TAATGCCTTC TAAATAATTT TTTTGGGAAA CTACATTATC ACAAAATTAT 
ACAAATTTTT TTACAAGTAT TTACATACTG TATCTGAAAA CAGACTTTAA 
AGTCACAAGA TTATAAATGT ACATATGTAT TCTCACATTC TGAAAAATAA 
CATTCTCAGA ATCCACAGAA AATATACTTA GTTACTACTG AAGATAATTT 
TTGAAATGTA AAAATTAGAT TTAAATAGTA TATTTTAAAT GACAGAACTA 
TAATTACAGA GATCAGATCA GATAGGTAAA CTGCAAGATA GATAGGATGA 
AACTTTTGGC CTACTGTATT ACTTACAGAG TTTTTTTGTG TGTGGTTTTT 
AAAACTGTTA AGGCAAGAAG TGTCAAATGC TTTAGAGTTA AATAACAGAT 
CACTGATTTC AAAGACTTGG TGTATAGTGT TAAAAATTAA AGCTTAAAAG 
GTGGTTAGAA AAGTGGATTA ATGCAAAAGG GGTAATAAAG ACTGCAACAT 
TCTCAGGACC AAATTAAACT GCTAA 
Figure 26 



•NSAAAEF»L 

KGKLQELGAY 
LHGVLDKLRS 
PLAIPSARPE 
APSEDVIDIK 
RPPASRNADS 
LNSLEETYSP 
EQDDDYGSRT 
VTKTTNYSTV 
ETKGDSVEKN 
CKAFPNCKFA 
PAPPSSSQLC 
PRHALKWIRP 
LS'YEVLLPI 
IWGMFVHCCC 
TTFTFPKDYI 
LRLSATFGKV 
♦CLLNNFFGK 
HSQNPQKIYL 
NFWPTVLLTE 
WRKVD«CKR 



RGRSPRQPPG 
VDEELPDYIM 
VTTEPSSLKS 
KRDSRVSTSS 
PEPDDLIDED 
GVHLNRLQFQ 
FFRNNSEKMS 
GSISSSVSVP 
PQKQTLPVAP 
QAEMSELSVA 
EKCLFVHPNC 
RYFPACKKME 
QTSE«HPVLP 
YLKCLIFQVC 
EDQHMKLTSW 
MFIIHHENSI 
PVFLPSVTG* 
LHYHKIIQIF 
VTTEDNF«NV 
FFCVWFLKXjIj 
GNKDCNILRT 



KPSAAQC^VP 
VMVANKKSQD 
SDTNIFDSNV 
QESKTTNVRQ 
LNFVQENPLS 
QQQNSIHAAK 
MEDENFRKRK 
AKPERRPSLP 
RTRTSQEELL 
QKPEKLLERC 
KYDAKCTKPD 
CPFYHPKHCR 
GRRSCSLEVF 
KFIMWF*HWV 
LVMVLQLRGL 
GQRY«GCLKY 
MRWVWTVEAA 
LQVFTYCI* K 
KIRFIOYILN 
RQEVSNALEL 
KLNC«" 



ARRRAMEIGT 
QMTEDLSLFL 
PSNKNNFSRG 
TYDDGAATRL 
QKEPTVTLTY 
QLDMQSSWVY 
LPWSSWKV 
PSKQANKNLI 
AEWQGQSRT 
KYWPACKNGD 
CPFTHVSRRI 
FNTQCTSPDC 
MY* •KILYRT 
FLFCFYYEKT 
HGCCVSGEMQ 
SILLFNF*VN 
MEIGT ETSRKIRSAI 

KGKLQELGAY VDEELPDYIM VMVANKKSQD QMTEDLSLFL GNNTIRFTVW 
LHGVLDKLRS VTTEPSSLKS SDTNIFDSNV PSNKNNFSRG DERRHEAAVP 
PLAIPSARPE KRDSRVSTSS QESKTTNVRQ TYDDGAATRL MSTVKPLREP 
APSEDVIDIK PEPDDLIDED LNFVQENPLS QKEPTVTLTY GSSRPSIEIY 
RPPASRNADS GVHLNRLQFQ QQQNSIHAAK QLDMQSSWVY ETGRLCEPEV 
LNSLEETYSP FFRNNSEKMS MEDENFRKRK LPWSSWKV KKFNHDGEEE 
EGDDDYGSRT GSISSSVSVP AKPERRPSLP PSKQANKNLI LKAISEAQES 
VTKTTNYSTV PQKQTLPVAP RTRTSQEELL AEWQGQSRT PRISPPIKEE 
ETKGDSVEKN QAEMSELSVA QKPEKLLERC KYWPACKNGD ECAYHHPISP 
CKAFPNCKFA EKCLFVHPNC KYDAKCTKPD CPFTHVSRRI PVLSPKPVAP 
PAPPSSSQLC RYFPACKKME CPFYHPKHCR FNTQCTSPDC TFYHPTINVP 
PRHALKWIRP QTSE 



